Abstract

Formation of galaxy clusters corresponds to the collapse of the largest gravitationally bound
overdensities in the initial density field and is accompanied by the most energetic phenom- 
ena since the Big Bang and by the complex interplay between gravity-induced dynamics 
of collapse and baryonic processes associated with galaxy formation. Galaxy clusters are, 
thus, at the cross-roads of cosmology and astrophysics and are unique laboratories for test- 
ing models of gravitational structure formation, galaxy evolution, thermodynamics of the 
intergalactic medium, and plasma physics. At the same time, their large masses make them 
a useful probe of growth of structure over cosmological time, thus providing cosmological 
constraints that are complementary to other probes. In this review, we describe our cur- 
rent understanding of cluster formation: from the general picture of collapse from initial 
density fluctuations in an expanding Universe to detailed simulations of cluster formation 
including the effects of galaxy formation. We outline both the areas in which highly ac- 
curate predictions of theoretical models can be obtained and areas where predictions are 
uncertain due to uncertain physics of galaxy formation and feedback. The former includes 
the description of the structural properties of the dark matter halos hosting clusters, their
mass function and clustering properties. Their study provides a foundation for cosmolog- 
ical applications of clusters and for testing the fundamental assumptions of the standard 
model of structure formation. The latter includes the description of the total gas and stellar 
fractions, the thermodynamical and non-thermal processes in the intracluster plasma. Their 
study serves as a testing ground for galaxy formation models and plasma physics. In this 
context, we identify a suitable radial range where the observed thermal properties of the 
intracluster plasma exhibit the most regular behavior and thus can be used to define robust
observational proxies for the total cluster mass. Finally, we discuss the formation of clusters 
in non-standard cosmological models, such as non-Gaussian models for the initial density 
field and models with modified gravity, along with prospects for testing these alternative 
scenarios with large cluster surveys in the near future. 
1. Introduction



The tendency of nebulae to cluster was discovered by Charles Messier and William Herschel,
who constructed the first systematic catalogs of these objects. This tendency became more
apparent as larger and larger samples of galaxies were compiled in the 19th and early 20th
centuries. Studies of the most prominent concentrations of nebulae, the clusters of galaxies,
were revolutionized in the 1920s by Edwin Hubble's proof that spiral and elliptical nebulae
were bona fide galaxies like the Milky Way located at large distances from us (Hubble 1925,
1926), which implied that clusters of galaxies are systems of enormous size. Just a few years
later, measurements of galaxy velocities in regions of clusters made by Hubble & Humason
(1931) and the assumption of virial equilibrium of galaxy motions were used to show that the
total gravitating masses of the Coma (Zwicky 1933, and see also Zwicky 1937) and Virgo
(Smith 1936) clusters were enormous as well. The masses implied by the measured velocity
dispersions were found to exceed the combined mass of all the stars in cluster galaxies by
factors of ~ 200 - 400, which prompted Zwicky to postulate the existence of large amounts
of "dark matter" (DM), inventing this widely used term in the process. Although the evidence
for dark matter in clusters was disputed in the subsequent decades, as it was realized that
stellar masses of galaxies were underestimated in the early studies, dark matter was
ultimately confirmed by the discovery of extended hot intracluster medium (ICM) emitting at
X-ray energies by thermal bremsstrahlung that was found to be smoothly filling intergalactic
space within the Coma cluster (Cavaliere, Gursky & Tucker 1971; Forman et al. 1972; Gursky
et al. 1971; Kellogg et al. 1972; Meekins et al. 1971). The X-ray emission of the ICM has
not only provided a part of the missing mass (as was conjectured on theoretical grounds by
Limber 1959, van Albada 1960), but also allows the detection of clusters out to z > 1
(Rosati, Borgani & Norman 2002). Furthermore, measurement of the ICM temperature has
provided an independent confirmation that the depth of the gravitational potential of
clusters requires an additional dark component. It was also quickly realized that inverse
Compton scattering of the cosmic microwave background (CMB) photons off thermal electrons
of the hot intergalactic plasma should lead to distortions in the CMB spectrum, equivalent
to black body temperature variations of about $10^{-4}-10^{-5}$ [the Sunyaev-Zel'dovich (SZ)
effect; Sunyaev & Zeldovich 1980, 1970, 1972b]. This effect has now been measured in
hundreds of clusters (e.g., Carlstrom, Holder & Reese 2002).



Given such remarkable properties, it is no surprise that the quest to understand the
formation and evolution of galaxy clusters has become one of the central efforts in modern
astrophysics over the past several decades. Early pioneering models of collapse of initial
density fluctuations in the expanding Universe have shown that systems resembling the Coma
cluster can indeed form (Peebles 1970; van Albada 1960, 1961; White 1976). Gott & Gunn
(1971, see also Sunyaev & Zeldovich 1972a) showed that hot gas observed in the Coma cluster
via X-ray observations can be explained within such a collapse scenario by heating of the
infalling gas by the strong accretion shocks. Subsequently, emergence of the hierarchical
model of structure formation (Gott & Rees 1975, Press & Schechter 1974, White & Rees
1978), combined with the cold dark matter (CDM) cosmological scenario (Blumenthal et al.
1984; Bond, Szalay & Turner 1982), provided a powerful framework for interpretation of
the multi-wavelength cluster observations. At the same time, rapid advances in computing
power and new, efficient numerical algorithms have allowed fully three-dimensional ab
initio numerical calculations of cluster formation within a self-consistent cosmological
context, both in the dissipationless regime (Efstathiou et al. 1985, Klypin & Shandarin 1983)
and including a dissipational baryonic component (Evrard 1988, 1990).



In the past two decades, theoretical studies of cluster formation have blossomed into a vi- 
brant and mature scientific field. As we detail in the subsequent sections, the standard sce- 
nario of cluster formation has emerged and theoretical studies have identified the most im- 
portant processes that shape the observed properties of clusters and their evolution, which 
has enabled usage of clusters as powerful cosmological probes (see, e.g., Allen, Evrard & 
Mantz 2011, for a recent review). At the same time, observations of clusters at different
redshifts have highlighted several key discrepancies between models and observations, which
are particularly salient in the central regions (cores) of clusters. 

In the current paradigm of structure formation, clusters are thought to form via a hierarchical
sequence of mergers and accretion of smaller systems driven by gravity and DM
that dominates the gravitational field. Theoretical models of clusters employ a variety of 
techniques determined by a particular aspect of cluster formation they aim to understand. 
Many of the bulk properties of clusters are thought to be determined solely by the initial 
conditions, dissipationless DM that dominates cluster mass budget, and gravity. Thus, clus- 
ter formation is often approximated in models as DM-driven dissipationless collapse from 
cosmological initial conditions in an expanding Universe. Such models are quite successful 
in predicting the existence and functional form of correlations between cluster properties, 
as well as their abundance and clustering, as we discuss in detail in Section 3. One of 
the most remarkable models of this kind is a simple self-similar model of clusters (Kaiser 
1986, see § 3.9 below). Despite its simplicity, the predictions of this model are quite close 
to results of observations and have, in fact, been quite useful in providing baseline expecta- 
tions for evolution of cluster scaling relations. Studies of abundance and spatial distribution 
of clusters using dissipationless cosmological simulations show that these statistics retain 
remarkable memory of the initial conditions. 

The full description of cluster formation requires detailed modeling of the non-linear
processes of collapse and the dissipative physics of baryons. The gas is heated to high,
X-ray emitting temperatures by adiabatic compression and shocks during collapse and settles
in hydrostatic equilibrium within the cluster potential well. Once the gas is sufficiently
dense, it cools, a process that can feed both star formation and accretion onto supermassive
black holes (SMBHs) harbored by the massive cluster galaxies. The process of cooling and
formation of stars and SMBHs can then result in energetic feedback due to supernovae (SNe)
or active galactic nuclei (AGN), which can inject substantial amounts of heat into the ICM
and spread heavy elements throughout the cluster volume.

Galaxy clusters are therefore veritable crossroads of astrophysics and cosmology: While 
abundance and spatial distribution of clusters bear indelible imprints of the background 
cosmology, gravity law, and initial conditions, the nearly closed-box nature of deep cluster 
potentials makes them ideal laboratories to study processes operating during galaxy forma- 
tion and their effects on the surrounding intergalactic medium. 

In this review we discuss the main developments and results in the quest to understand the 
formation and evolution of galaxy clusters. Given the limited space available for this review 
and the vast amount of literature and research directions related to galaxy clusters, we have 
no choice but to limit the focus of our review, as well as the number of cited studies. Specif- 
ically, we focus on the most basic and well-established elements of the standard paradigm 
of DM-driven hierarchical structure formation within the framework of ΛCDM cosmology
as it pertains to galaxy clusters. We focus mainly on the theoretical predictions of the prop- 
erties of the total cluster mass distribution and properties of the hot intracluster gas, and 
only briefly discuss results pertaining to the evolution of the stellar component of clusters,
understanding of which is still very much a work in progress. Comparing model predictions
to real clusters, we mostly focus on comparisons with X-ray observations, which have pro- 
vided the bulk of our knowledge of ICM properties so far. In § 5, we briefly discuss the 
differences in formation of clusters in models with the non-Gaussian initial conditions and 
modified gravity. Specifically, we focus on the information that statistics sensitive to the 
cluster formation process, such as cluster abundance and clustering, can provide about the 
primordial non-Gaussianity and possible deviations of gravity from General Relativity. We 
refer readers to recent extensive reviews on cosmological uses of galaxy clusters by Allen, 
Evrard & Mantz (2011) and Weinberg et al. (2012) for a more extensive discussion of this 
topic. 



2. The observed properties of galaxy clusters 



Observational studies of galaxy clusters have now developed into a broad, multi-faceted and 
multi-wavelength field. Before we embark on our overview of different theoretical aspects 
of cluster formation, we briefly review the main observational properties of clusters and, in 
particular, the basic properties of their main matter constituents. 

Figure 1 shows examples of the multiwavelength observations of two massive clusters at
two different cosmic epochs: Abell 1689 at z = 0.18 and SPT-CL J2106-5844 at
z = 1.133. It illustrates all of the main components of the clusters: the luminous stars in
and around galaxies (the intracluster light or ICL), the hot ICM observed via its X-ray
emission and the Sunyaev-Zel'dovich effect and, in the case of Abell 1689, even the presence
of invisible DM manifesting itself through gravitational lensing of background galaxies,
which distorts their images into extended, cluster-centric arcs (Bartelmann 2010, and
references therein). At larger radii, the lensing effect is weaker. Although not easily
visible by eye, it can still be reliably measured by averaging the shapes of many background
galaxies and comparing the average with the expected value for an isotropic distribution of
shapes. Gravitational lensing is a direct probe of the total mass distribution in clusters,
which makes it both extremely powerful in its own right and a very useful check of other
methods of measuring cluster masses. The figure shows several bright elliptical galaxies that
are typically located near the cluster center. A salient feature of such central galaxies is
that they show little evidence of ongoing star formation, despite their extremely large masses.

The diffuse plasma is not associated with individual galaxies and constitutes the
intracluster medium, which contains the bulk of the normal baryonic matter in massive clusters.
Although the hot ICM is not directly associated with galaxies, their properties are
correlated. For example, Fig. 2 shows the mass of the ICM gas within the radius $R_{500}$,
defined as the radius enclosing a mean overdensity of $500\rho_{\rm cr}$, versus stellar mass
in galaxies within the same radius for a number of local (z < 0.1) and distant
(0.1 < z < 0.6) clusters (Lin et al. 2012). Here $\rho_{\rm cr}(z) = 3H(z)^2/(8\pi G)$ is the
critical mean density of the Universe, defined in terms of the Hubble function $H(z)$. The
figure shows a remarkably tight, albeit non-linear, correlation between these two baryonic
components. It also shows that the gas mass in clusters is on average about ten times larger
than the mass in stars, although the stellar-to-gas mass ratio is systematically larger for
smaller mass clusters, ranging from $M_*/M_{\rm g} \approx 0.2$ to $\approx 0.05$, as mass
increases from group scale ($M_{500} \sim {\rm few} \times 10^{13}\,M_\odot$) to massive
clusters ($M_{500} \sim 10^{15}\,M_\odot$).
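
The definition of $R_{500}$ through the critical density used above can be made concrete with a short numerical sketch. The Python functions below evaluate $\rho_{\rm cr}(z) = 3H(z)^2/(8\pi G)$ and the radius enclosing a mean density of $500\rho_{\rm cr}$ for a given mass. The function names, the rounded physical constants, and the default parameters ($h = 0.7$, $\Omega_{\rm m} = 0.27$, $\Omega_\Lambda = 0.73$) are assumptions made for this illustration only, and radiation is neglected in the expansion rate.

```python
import math

# Rounded physical constants in cgs units (illustrative values)
G_CGS = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
MPC_CM = 3.0857e24      # one megaparsec in cm
MSUN_G = 1.989e33       # one solar mass in g


def expansion_rate(z, omega_m=0.27, omega_l=0.73):
    """Normalized expansion rate H(z)/H0, neglecting radiation."""
    omega_k = 1.0 - omega_m - omega_l
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_k * (1.0 + z) ** 2 + omega_l)


def rho_crit(z, h=0.7, omega_m=0.27, omega_l=0.73):
    """Critical density 3 H(z)^2 / (8 pi G) in Msun / Mpc^3."""
    h0 = 100.0 * h * 1.0e5 / MPC_CM                      # H0 in s^-1
    hz = h0 * expansion_rate(z, omega_m, omega_l)        # H(z) in s^-1
    rho_cgs = 3.0 * hz ** 2 / (8.0 * math.pi * G_CGS)    # g cm^-3
    return rho_cgs * MPC_CM ** 3 / MSUN_G


def r_delta(mass_msun, z, delta=500.0, **cosmo):
    """Radius in Mpc enclosing a mean density of delta * rho_crit(z)."""
    volume = mass_msun / (delta * rho_crit(z, **cosmo))
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
```

Under these assumptions, `r_delta(1e15, 0.0)` gives roughly 1.5 Mpc for a $10^{15}\,M_\odot$ cluster at z = 0, and the radius shrinks at higher redshift because $\rho_{\rm cr}(z)$ grows with $H(z)^2$.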




Fig. 1. Left panel: the composite X-ray/optical image (556 kpc on a side) of the galaxy cluster
Abell 1689 at redshift z = 0.18. The purple haze shows X-ray emission of the $T \sim 10^8$ K
gas, obtained by the Chandra X-ray Observatory. Images of galaxies in the optical band, colored
in yellow, are from observations performed with the Hubble Space Telescope. The long arcs in
the optical image are caused by the gravitational lensing of background galaxies by matter in
the galaxy cluster, the largest system of such arcs ever found (Credit: X-ray: NASA/CXC/MIT;
Optical: NASA/STScI). Right panel: the galaxy cluster SPT-CL J2106-5844 at z = 1.133, the most
massive cluster known at z > 1 discovered via its Sunyaev-Zel'dovich (SZ) signal
($M_{200} \sim 1.3 \times 10^{15}\,M_\odot$). The color image shows the Magellan/LDSS3 optical
and Spitzer/IRAC mid-infrared measurements (corresponding to the blue-green-red color
channels). The frame subtends 4.8 x 4.8 arcmin, which corresponds to 2.4 x 2.4 Mpc at the
redshift of the cluster. The white contours correspond to the South Pole Telescope SZ
significance values, as labeled, where dashed contours are used for the negative significance
values. (Adapted from Foley et al. 2011).



The temperature of the ICM is consistent with velocities of galaxies and indicates that both 
galaxies and gas are nearly in equilibrium within a common gravitational potential well. 
The mass of galaxies and hot gas is not sufficient to explain the depth of the potential well, 
which implies that most of the mass in clusters is in a form of DM. Given that hydrogen 
is by far the most abundant element in the Universe, most of the plasma particles are elec- 
trons and protons, with a smaller number of helium nuclei. There are also trace amounts of 
heavier nuclei some of which are only partially ionized. The typical average abundance of 
the heavier elements is about one-third of that found in the Sun or a fraction of one per cent 
by mass; it decreases with increasing radius and can be quite inhomogeneous, especially in 
merging systems (Werner et al. 2008, for a review).



Thermodynamic properties of the ICM are of utmost importance, because comparing such 
properties to predictions of baseline models without cooling and heating can help to isolate 
the impact of these physical processes in cluster formation. The most popular baseline 
model is the self-similar model of clusters developed by Kaiser (1986), which we consider
in detail in Section 3.9 below. In its simplest version, this model assumes that clusters are
scaled versions of each other, so that gas density at a given fraction of the characteristic 
Fig. 2. The mass in stars vs. the mass of hot, X-ray emitting gas. Both masses are measured
within the radius $R_{500}$ estimated from the observationally calibrated $Y_X - M_{500}$
relation, assuming flat ΛCDM cosmology with $\Omega_{\rm m} = 1 - \Omega_\Lambda = 0.26$ and
$h = 0.71$. Red circles show local clusters located at z < 0.1, whereas magenta squares show
higher-redshift clusters: 0.1 < z < 0.6 (see Lin et al. 2012, for details). The dotted line
corresponds to the constant stellar-to-gas mass ratio $M_{*,500}/M_{\rm g,500} = 0.1$, whereas
the dashed lines correspond to the values of 0.05 and 0.2 for this ratio.



radius of clusters, defined by their mass, is independent of cluster mass. Figure 3 shows the
electron density in clusters as a function of ICM temperature (and hence mass) at different
radii. It is clear that density is independent of temperature only outside the cluster core at
$r \sim R_{500}$, although there is an indication that density is independent of temperature at
$r = R_{2500}$ for $k_{\rm B}T > 3$ keV. This indicates that processes associated with galaxy
formation and feedback affect the properties of clusters at $r < R_{2500}$, but their effects
are mild at larger radii.

During the past two decades, it has been established that the core regions of the relaxed
clusters are generally characterized by a strongly peaked X-ray emissivity, indicating
efficient cooling of the gas (e.g., Fabian 1994). Quite interestingly, spectroscopic
observations with the Chandra and XMM-Newton satellites have demonstrated that, despite strong
X-ray emission of the hot gas, only a relatively modest amount of this gas cools down to
low temperatures (e.g., Böhringer et al. 2001; Peterson et al. 2001). This result is generally




Fig. 3. The observed electron number density, $n_{\rm e}$, in galaxy clusters and groups,
measured at different radii (from top to bottom: $0.15R_{500}$, $R_{2500}$, $R_{500}$; see
labels) as a function of the intracluster medium temperature at $R_{500}$. The values of
$n_{\rm e}$ are rescaled by $E^{-2}(z)$, the scaling expected from the definition of the radii
at which densities are measured. Squares and circles show systems observed with the Chandra
X-ray Observatory from the studies by Vikhlinin et al. (2009a) and Sun et al. (2009),
triangles show systems observed with the XMM-Newton telescope by Pratt et al. (2010).
Note that electron densities at large radii are independent of temperature, as expected from
the self-similar model, whereas at small radii the rescaled densities increase with
temperature. Note also that the scatter from cluster to cluster increases with decreasing
radius, especially for low-temperature groups (after Sun 2012).



consistent with the low levels of star formation observed in the brightest cluster galaxies
(BCGs; e.g., McDonald et al. 2011). It implies that a heating mechanism should compensate
for radiative losses, thereby preventing the gas in cluster cores from cooling down to low
temperature. The presence of cool cores is also reflected in the observed temperature
profiles (e.g., Leccardi & Molendi 2008; Pratt et al. 2007; Vikhlinin et al. 2006; see also
Figure 4), which exhibit a decline of temperature with decreasing radius in the innermost
regions of relaxed cool-core clusters.



One of the most important and most widely studied aspects of ICM properties are cor- 
relations between its different observable integrated quantities and between observable 
Fig. 4. Comparison between temperature profiles, normalized to the global temperature measured
within $R_{180}$ by Leccardi & Molendi (2008), for a set of about 50 nearby clusters with
$z \lesssim 0.3$ and with temperature $k_{\rm B}T_X > 3$ keV, observed with the XMM-Newton
(dots with errorbars) and results from cosmological hydrodynamical simulations including the
effect of radiative cooling, star formation and supernova feedback in the form of galactic
winds (solid curve; Borgani et al. 2004). From Leccardi & Molendi (2008).



quantities and total mass. Such scaling relations are the key ingredient in cosmological
uses of clusters, where it is particularly desirable that the relations are characterized by
small scatter and are independent of the relaxation state and other properties of clusters.
Although clusters are fascinatingly complex systems overall, they do exhibit some remarkable
regularities. As an example, Figure 5 shows the correlation between the bolometric
luminosity emitted from within $R_{500}$ and the $Y_X$ parameter defined as a product of gas
mass within $R_{500}$ and ICM temperature derived from the X-ray spectrum within the radial
range $(0.15-1)R_{500}$ (Kravtsov, Vikhlinin & Nagai 2006) for the Representative XMM-Newton
Cluster Structure Survey (REXCESS) sample of clusters studied by Pratt et al. (2009).
Different symbols indicate clusters in different states of relaxation, whereas clusters with
strongly peaked central gas distribution (the cool core clusters) and clusters with less
centrally concentrated gas distribution are shown with different colors. The left panel shows
total luminosity integrated within radius $R_{500}$, whereas the right panel shows luminosity
calculated with the central region within $0.15R_{500}$ excised. Quite clearly, the
core-excised X-ray luminosity exhibits remarkably tight correlation with $Y_X$, which, in turn,
is expected to correlate tightly with total cluster mass (Fabjan et al. 2011; Kravtsov,
Vikhlinin & Nagai 2006; Stanek et al. 2010). This figure illustrates the general findings in
the past decade that clusters exhibit strong regularity and tight correlations among X-ray
observable quantities
Fig. 5. Correlation of bolometric luminosity of intracluster gas and $Y_X = M_{\rm gas}T_X$,
where $M_{\rm gas}$ is the mass of the gas within $R_{500}$ and $T_X$ is temperature derived
from the fit to gas spectrum accounting only for emission from radial range $(0.15-1)R_{500}$.
Results are shown for the local clusters from the Representative XMM-Newton Cluster Structure
Survey sample of Pratt et al. (2009). The left panel shows total luminosity integrated within
radius $R_{500}$, whereas the right panel shows bolometric luminosity calculated with the
central $0.15R_{500}$ of the cluster excised. Labels in the top left corner indicate the radial
range used in computing the luminosity and logarithmic scatter of luminosity at fixed $Y_X$.
The blue points show cool core clusters, whereas magenta points are non-cool core clusters.
Clusters classified as relaxed and disturbed are shown by circles and squares, respectively.
Note that exclusion of the cluster cores reduces the scatter between luminosity and $Y_X$ by
more than a factor of two.



and total mass, provided that relevant quantities are measured after excluding the emission 
from cluster cores. 



3. Understanding the formation of galaxy clusters 



3.1 Initial density perturbation field and its linear evolution 



In the currently standard hierarchical structure formation scenario, objects are thought to
form via gravitational collapse of peaks in the initial primordial density field
characterized by the density contrast (or overdensity) field:
$\delta(\mathbf{x}) = [\rho(\mathbf{x}) - \bar{\rho}_{\rm m}]/\bar{\rho}_{\rm m}$, where
$\bar{\rho}_{\rm m}$ is the mean mass density of the Universe. Properties of the field
$\delta(\mathbf{x})$ depend on specific details of the processes occurring during the earliest
inflationary stage of evolution of the Universe (Bardeen, Steinhardt & Turner 1983; Guth & Pi
1982; Starobinsky 1982) and the subsequent stages prior to recombination (Bardeen et al. 1986,
Bond & Efstathiou 1984, Eisenstein & Hu 1999, Peebles 1982). A fiducial assumption of most
models that we discuss is that $\delta(\mathbf{x})$ is a homogeneous and isotropic Gaussian
random field. We briefly discuss
non-Gaussian models in Section 5.1. 

Statistical properties of a uniform and isotropic Gaussian field can be fully characterized
by its power spectrum, $P(k)$, which depends only on the modulus $k$ of the wavevector, but
not on its direction. A related quantity is the variance of the density contrast field smoothed
on some scale $R$, $\delta_R(\mathbf{x}) \equiv \int \delta(\mathbf{x} - \mathbf{r})W(\mathbf{r},R)\,d^3r$:

$$\langle\delta_R^2\rangle \equiv \sigma^2(R) = \frac{1}{(2\pi)^3}\int P(k)\,|\tilde{W}(\mathbf{k},R)|^2\,d^3k, \qquad (1)$$

where $\tilde{W}(\mathbf{k},R)$ is the Fourier transform of the window (filter) function
$W(\mathbf{r},R)$, such that $\delta_R(\mathbf{k}) = \delta(\mathbf{k})\tilde{W}(\mathbf{k},R)$
[see, e.g., Zentner (2007) or Mo, van den Bosch & White (2010) for details on the definition of
$P(k)$ and choices of window function]. For the cases when one is interested in only a narrow
range of $k$, the power spectrum can be approximated by the power-law form, $P(k) \propto k^n$,
and the variance is $\sigma^2(R) \propto R^{-(n+3)}$.
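
The power-law scaling of the variance quoted above can be checked numerically. The sketch below is a minimal illustration, not the procedure of any cited work: it assumes a Gaussian window, $\tilde{W}(k,R) = \exp(-k^2R^2/2)$, chosen only because the integral then converges for any $n > -3$, and it assumes the Fourier convention in which $\sigma^2(R) = (2\pi^2)^{-1}\int k^2 P(k)|\tilde{W}|^2 dk$. The function names and the amplitude `amp` are hypothetical.

```python
import math


def sigma2_powerlaw(radius, n, amp=1.0, nk=4000):
    """sigma^2(R) for P(k) = amp * k**n smoothed with a Gaussian window.

    Evaluates (1 / 2 pi^2) * int k^3 P(k) exp(-k^2 R^2) d ln k with the
    trapezoidal rule on a logarithmic grid that brackets k = 1/R.
    """
    lnk_min = math.log(1.0e-6 / radius)
    lnk_max = math.log(20.0 / radius)
    dlnk = (lnk_max - lnk_min) / nk
    total = 0.0
    for i in range(nk + 1):
        k = math.exp(lnk_min + i * dlnk)
        integrand = k ** 3 * amp * k ** n * math.exp(-(k * radius) ** 2)
        weight = 0.5 if i in (0, nk) else 1.0   # trapezoid end-point weights
        total += weight * integrand
    return total * dlnk / (2.0 * math.pi ** 2)


def log_slope(n, r1=1.0, r2=2.0):
    """Logarithmic slope d ln sigma^2 / d ln R between two radii."""
    ratio = sigma2_powerlaw(r2, n) / sigma2_powerlaw(r1, n)
    return math.log(ratio) / math.log(r2 / r1)
```

For any spectral index $n > -3$ the measured slope `log_slope(n)` should reproduce $-(n+3)$; for $n = -1$, for instance, the variance falls as $R^{-2}$. For this window the integral can also be done analytically, giving $\sigma^2(R) = {\rm amp}\,\Gamma[(n+3)/2]\,R^{-(n+3)}/(4\pi^2)$, which provides an independent check of the normalization.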

At a sufficiently high redshift $z$, for the spherical top-hat window function mass and radius
are interchangeable according to the relation $M = (4\pi/3)\bar{\rho}_{\rm m}(z)R^3$. We can
think about the density field smoothed on the scale $R$ or the corresponding mass scale $M$.
The characteristic amplitude of peaks in the $\delta_R$ (or $\delta_M$) field smoothed on scale
$R$ (or mass scale $M$) is given by $\sigma(R) \equiv \sigma(M)$. The smoothed Gaussian density
field is, of course, also Gaussian with the probability distribution function (PDF) given by

$$p(\delta_M) = \frac{1}{\sqrt{2\pi}\,\sigma(M)}\exp\left[-\frac{\delta_M^2}{2\sigma^2(M)}\right]. \qquad (2)$$
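
To give a sense of the numbers involved, the following sketch evaluates the top-hat mass-radius relation and the Gaussian PDF of equation (2). It assumes that $R$ is expressed as a comoving radius in $h^{-1}$ Mpc, so that the mean matter density can be taken at its present-day value, $\Omega_{\rm m}\rho_{\rm cr,0}$ with $\rho_{\rm cr,0} \approx 2.775 \times 10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$. The value $\Omega_{\rm m} = 0.27$ and the function names are assumptions made for the example.

```python
import math

RHO_CRIT0 = 2.775e11   # present-day critical density, h^2 Msun / Mpc^3


def tophat_mass(radius, omega_m=0.27):
    """Mass in h^-1 Msun inside a comoving top-hat of radius in h^-1 Mpc."""
    return 4.0 * math.pi / 3.0 * omega_m * RHO_CRIT0 * radius ** 3


def gaussian_pdf(delta, sigma):
    """PDF of the smoothed density contrast, equation (2)."""
    norm = math.sqrt(2.0 * math.pi) * sigma
    return math.exp(-0.5 * (delta / sigma) ** 2) / norm


def fraction_above(threshold, sigma):
    """Integral of the Gaussian PDF over delta_M > threshold."""
    return 0.5 * math.erfc(threshold / (math.sqrt(2.0) * sigma))
```

With these assumptions a sphere of comoving radius $8\,h^{-1}$ Mpc contains about $1.6 \times 10^{14}\,h^{-1}\,M_\odot$, which is why cluster masses correspond to smoothing scales of order 10 Mpc. The tail integral `fraction_above` shows how quickly high peaks become rare: a threshold of $3\sigma$ is exceeded in only about 0.13 per cent of the volume.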



During the earliest linear stages of evolution in the standard structure formation scenario
the initial Gaussianity of the $\delta(\mathbf{x})$ field is preserved, as different Fourier
modes $\delta(\mathbf{k})$ evolve independently and grow at the same rate, described by the
linear growth factor, $D_+(a)$, as a function of expansion factor $a = (1+z)^{-1}$, which for a
ΛCDM cosmology is given by (Heath 1977):

$$\delta(a) \propto D_+(a) = \frac{5\Omega_{\rm m}}{2}E(a)\int_0^a \frac{da'}{[a'E(a')]^3}, \qquad (3)$$

where $E(a)$ is the normalized expansion rate, which is given by

$$E(a) \equiv \frac{H(a)}{H_0} = \left[\Omega_{\rm m}a^{-3} + (1 - \Omega_{\rm m} - \Omega_\Lambda)a^{-2} + \Omega_\Lambda\right]^{1/2}, \qquad (4)$$



if the contribution from relativistic species, such as radiation or neutrinos, to the
energy density is neglected. Growth rate and the expression for $E(a)$ in more general,
homogeneous dark energy (DE) cosmologies are described by Percival (2005). Note that in models
in which DE is clustered (Alimi et al. 2010) or gravity deviates from General Relativity (GR,
see Section 5.2), the growth factor can be scale dependent.

Correspondingly, the linear evolution of the root mean square (rms) amplitude of fluctuations
is given by $\sigma(M,a) = \sigma(M,a_i)D_+(a)/D_+(a_i)$, which is often useful to recast in
terms of linearly extrapolated rms amplitude $\sigma(M,a=1)$ at $a = 1$ (i.e., $z = 0$):

$$\sigma(M,a) = \sigma(M,a=1)D_{+0}(a), \quad {\rm where} \quad D_{+0}(a) \equiv D_+(a)/D_+(a=1). \qquad (5)$$
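
The growth factor integral and its normalized form are straightforward to evaluate numerically. The following sketch uses composite Simpson integration from the Python standard library only; the function names, the default parameters ($\Omega_{\rm m} = 0.27$, $\Omega_\Lambda = 0.73$), and the number of integration steps are assumptions made for this illustration, and relativistic species are neglected as in equation (4).

```python
import math


def expansion_rate(a, omega_m=0.27, omega_l=0.73):
    """Normalized expansion rate E(a) = H(a)/H0 without radiation."""
    omega_k = 1.0 - omega_m - omega_l
    return math.sqrt(omega_m * a ** -3 + omega_k * a ** -2 + omega_l)


def growth_factor(a, omega_m=0.27, omega_l=0.73, nstep=2000):
    """Linear growth factor D_+(a) from the Heath (1977) integral.

    The integrand 1/[a' E(a')]^3 behaves as a'^(3/2) near a' = 0, so it
    vanishes at the lower limit and a uniform Simpson grid is adequate.
    nstep must be even.
    """
    def integrand(x):
        if x == 0.0:
            return 0.0
        return 1.0 / (x * expansion_rate(x, omega_m, omega_l)) ** 3

    step = a / nstep
    total = integrand(0.0) + integrand(a)
    for i in range(1, nstep):
        total += (4.0 if i % 2 else 2.0) * integrand(i * step)
    integral = total * step / 3.0
    return 2.5 * omega_m * expansion_rate(a, omega_m, omega_l) * integral


def growth_normalized(a, **cosmo):
    """D_+0(a) = D_+(a) / D_+(a = 1)."""
    return growth_factor(a, **cosmo) / growth_factor(1.0, **cosmo)
```

A useful sanity check is the Einstein-de Sitter case, `growth_factor(a, omega_m=1.0, omega_l=0.0)`, for which the integral can be done by hand and gives $D_+(a) = a$. For the default parameters the growth factor at $a = 1$ comes out near 0.76, reflecting the suppression of growth once the cosmological constant becomes dynamically important.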



Once the amplitude of typical fluctuations approaches unity, $\sigma(M,a) \sim 1$, the linear
approximation breaks down. Further evolution must be studied by means of nonlinear models
or direct numerical simulations. We discuss results of numerical simulations extensively
below. However, we consider first the simplified, but instructive, spherical collapse model
and associated concepts and terminology. Such a model can be used to gain physical insight
into the key features of the evolution and is used as a basis for both definitions of
collapsed objects (see Section 3.6) and quantitative models for halo abundance and clustering
(Sections 3.7 and 3.8).



3.2 Non-linear evolution of spherical perturbations and non-linear mass scale 

The simplest model of non-linear collapse assumes that a density peak can be characterized
as a constant overdensity spherical perturbation of radius $R$. Despite its simplicity and
limitations discussed below, the model provides a useful insight into general features and
timing of non-linear collapse. Its results are commonly used in analytic models for halo
abundance and clustering and motivate mass definitions for collapsed objects. Below we briefly
describe the model and the non-linear mass scale that is based on its predictions.



3.2.1 Spherical collapse model. 

The spherical collapse model considers a spherically-symmetric density fluctuation of
initial radius $R_i$, amplitude $\delta_i > 0$, and mass $M = (4\pi/3)(1+\delta_i)\bar{\rho}R_i^3$,
where $R_i$ is the physical radius of the perturbation and $\bar{\rho}$ is the mean density of
the Universe at the initial time. Given the symmetry, the collapse of such a perturbation is a
one-dimensional problem and is fully specified by evolution of the top-hat radius $R(t)$
(Gunn & Gott 1972, Lahav et al. 1991). It consists of an initially decelerating increase of the
perturbation radius, until it reaches the maximum value, $R_{\rm ta}$, at the turnaround epoch,
$t_{\rm ta}$, and subsequent decrease of $R(t)$ at $t > t_{\rm ta}$ until the perturbation
collapses, virializes, and settles at the final radius $R_{\rm f}$ at $t = t_{\rm coll}$.
Physically, $R_{\rm f}$ is set by the virial relation between potential and kinetic energy and
is $R_{\rm f} = R_{\rm ta}/2$ in cosmologies with $\Omega_\Lambda = 0$. The turnaround epoch and
the epoch of collapse and virialization are defined by initial conditions.

The final mean internal density of a collapsed object can be estimated by noting that in an
$\Omega_\Lambda = 0$ Universe the time interval $t_{\rm coll} - t_{\rm ta} = t_{\rm ta}$ should
be equal to the free-fall time of a uniform sphere, $t_{\rm ff} = \sqrt{3\pi/(32G\rho_{\rm ta})}$,
which means that the mean density of perturbation at turnaround is
$\rho_{\rm ta} = 3\pi/(32Gt_{\rm ta}^2)$ and $\rho_{\rm coll} = 8\rho_{\rm ta} = 3\pi/(Gt_{\rm coll}^2)$.
These densities can be compared with background mean matter densities at the corresponding
times to get mean internal density contrasts: $\Delta = \rho/\bar{\rho}_{\rm m}$. In the
Einstein-de Sitter model ($\Omega_{\rm m} = 1$, $\Omega_\Lambda = 0$), background density evolves
as $\bar{\rho}_{\rm m} = 1/(6\pi Gt^2)$, which means that density contrast after virialization is

$$\Delta_{\rm vir} = \frac{\rho_{\rm coll}}{\bar{\rho}_{\rm m}} = 18\pi^2 = 177.653. \qquad (6)$$
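
The chain of density estimates leading to equation (6) can be verified with a few lines of arithmetic. The sketch below works in arbitrary units with $G = 1$ and $t_{\rm ta} = 1$ (the contrasts are dimensionless, so the choice of units is an assumption with no effect on the result); the function name is hypothetical.

```python
import math


def eds_density_contrasts(grav_const=1.0, t_ta=1.0):
    """Density contrasts at turnaround and after virialization (EdS)."""
    t_coll = 2.0 * t_ta                                     # collapse time
    rho_ta = 3.0 * math.pi / (32.0 * grav_const * t_ta ** 2)
    rho_coll = 8.0 * rho_ta                                 # R_f = R_ta / 2

    def rho_mean(t):
        """Einstein-de Sitter background matter density at time t."""
        return 1.0 / (6.0 * math.pi * grav_const * t ** 2)

    contrast_ta = rho_ta / rho_mean(t_ta)
    contrast_vir = rho_coll / rho_mean(t_coll)
    return contrast_ta, contrast_vir
```

The function returns $9\pi^2/16 \approx 5.55$ for the contrast at turnaround and $18\pi^2 \approx 177.65$ after virialization. The factor of 32 between the two comes from the eightfold increase of the internal density and the fourfold decrease of the background density between $t_{\rm ta}$ and $t_{\rm coll} = 2t_{\rm ta}$.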

For general cosmologies, the density contrast can be computed by estimating $\rho_{\rm coll}$ and $\bar{\rho}_{\rm m}(t_{\rm coll})$
in a similar fashion. For lower $\Omega_{\rm m}$ models, a fluctuation of the same mass $M$ and $\delta$ has a larger
initial radius and smaller physical density and, thus, takes longer to collapse. The density
contrasts of collapsed objects therefore are larger in lower density models because the mean
density of matter at the time of collapse is smaller. Accurate (to $\lesssim 1\%$ for $\Omega_{\rm m} = 0.1-1$)
approximations for $\Delta_{\rm vir}$ in open ($\Omega_\Lambda = 0$) and flat $\Lambda$CDM ($1 - \Omega_\Lambda - \Omega_{\rm m} = 0$) cosmologies
are given by Bryan & Norman (1998, their equation 6). For example, for the concordance
$\Lambda$CDM cosmology with $\Omega_{\rm m} = 0.27$ and $\Omega_\Lambda = 0.73$ (Komatsu et al. 2011), the density contrast
at $z = 0$ is $\Delta_{\rm vir} \approx 358$.
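
As an illustration, the fitting formulae referred to above can be evaluated in a few lines. This is a minimal sketch, not a definitive implementation: the function name and arguments are hypothetical, and the coefficients are the ones commonly quoted for the Bryan & Norman (1998) fit, which expresses the contrast relative to the critical density; they should be checked against that paper before use. Dividing by $\Omega_{\rm m}(z)$ converts the result to a contrast relative to the mean matter density.

```python
import math

def delta_vir_mean(omega_m_z, flat=True):
    """Virial density contrast relative to the mean matter density.

    omega_m_z : matter density parameter at the redshift of collapse
    flat      : True for flat LCDM (Omega_m + Omega_L = 1),
                False for an open model with Omega_L = 0
    """
    x = omega_m_z - 1.0
    if flat:
        # commonly quoted fit for flat LCDM (assumed coefficients)
        delta_crit = 18.0 * math.pi**2 + 82.0 * x - 39.0 * x**2
    else:
        # commonly quoted fit for open models (assumed coefficients)
        delta_crit = 18.0 * math.pi**2 + 60.0 * x - 32.0 * x**2
    # the fit is relative to the critical density; convert to the mean density
    return delta_crit / omega_m_z

print(delta_vir_mean(1.0))    # Einstein-de Sitter limit, 18 pi^2
print(delta_vir_mean(0.27))   # concordance LCDM at z = 0
```

For $\Omega_{\rm m} = 0.27$ the fit returns a contrast close to the value of $\approx 358$ quoted above, within the stated accuracy of the approximation.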

Note that if the initial density contrast $\delta_i$ grew only at the linear rate, $D_+(z)$, then
the density contrast at the time of collapse would be more than a hundred times smaller.
Its value can be derived starting from the density contrast linearly extrapolated to the turnaround
epoch, $\delta_{\rm ta}$. This epoch corresponds to the time at which the perturbation enters
the non-linear regime and detaches from the Hubble expansion, so that $\delta_{\rm ta} \sim 1$ is expected.
In fact, the exact calculation in the case of $\Omega_{\rm m}(z) = 1$ at the redshift of turnaround gives
$\delta_{\rm ta} = 1.062$ (Gunn & Gott 1972). Because $t_{\rm coll} = 2t_{\rm ta}$, further linear evolution for $\Omega_{\rm m}(z) = 1$
until the collapse time gives $\delta_c = \delta_{\rm ta}D_+(t_{\rm coll})/D_+(t_{\rm ta}) \approx 1.686$. In the case of $\Omega_{\rm m} \neq 1$ we
expect that $\delta_c$ should have different values. For instance, for $\Omega_{\rm m} < 1$ the density contrast
at turnaround should be higher to account for the higher rate of the Hubble expansion.
However, linear growth from $t_{\rm ta}$ to $t_{\rm coll}$ is smaller due to the slower redshift dependence of
$D_+(z)$. As a matter of fact, these two factors nearly cancel, so that $\delta_c$ has a weak dependence
on $\Omega_{\rm m}$ and $\Omega_\Lambda$ (e.g., Percival 2005). For the concordance $\Lambda$CDM cosmology at $z = 0$, for
example, $\delta_c \approx 1.675$.
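
The Einstein-de Sitter numbers quoted above can be checked with a short calculation. The sketch below assumes the standard closed-form result $\delta_{\rm ta} = (3/20)(6\pi)^{2/3}$ for the linearly extrapolated overdensity at turnaround and the Einstein-de Sitter growth law $D_+ \propto t^{2/3}$; the variable names are illustrative only.

```python
import math

# Linearly extrapolated overdensity at turnaround in Einstein-de Sitter:
# delta_ta = (3/20) * (6 pi)^(2/3)
delta_ta = 0.15 * (6.0 * math.pi) ** (2.0 / 3.0)

# In Einstein-de Sitter D_+ scales as t^(2/3), and t_coll = 2 t_ta,
# so the linear growth between turnaround and collapse is 2^(2/3).
growth_ratio = 2.0 ** (2.0 / 3.0)
delta_c = delta_ta * growth_ratio

print(f"delta_ta = {delta_ta:.3f}")
print(f"delta_c  = {delta_c:.3f}")
```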

Additional interesting effects may arise in models with DE characterized by small or zero 
speed of sound, in which structure growth is affected not only because DE influences linear 
growth, but also because it participates non-trivially in the collapse of matter and may slow 
down or accelerate the formation of clusters of a given mass depending on the DE equation of
state (Abramo et al. 2007, Creminelli et al. 2010). DE in such models can also contribute
non-trivially to the gravitating mass of clusters.



3.2.2 The nonlinear mass scale $M_{\rm NL}$.

The linear value of the collapse overdensity $\delta_c$ is useful in predicting whether a given initial
perturbation $\delta_i \ll 1$ at initial $z_i$ collapses by some later redshift $z$. The collapse condition is
simply $\delta_i D_{+0}(z) > \delta_c(z)$ and is used extensively to model the abundance and clustering of
collapsed objects, as we discuss below in § 3.7. The distribution of peak amplitudes in the
initial Gaussian overdensity field smoothed over mass scale $M$ is given by a Gaussian PDF
with an rms value of $\sigma(M)$ (Equation 2). The peaks in the initial Gaussian overdensity field
smoothed at redshift $z_i$ over mass scale $M$ can be characterized by the ratio $\nu = \delta_i/\sigma(M, z_i)$
called the peak height. For a given mass scale $M$, the peaks collapsing at a given redshift $z$
according to the spherical collapse model have the peak height given by:

$$\nu = \frac{\delta_c(z)}{\sigma(M, z)}. \tag{7}$$

Given that $\delta_c(z)$ is a very weak function of $z$ (changing by $\lesssim 1-2\%$ typically), whereas
$\sigma(M, z) = \sigma(M, z = 0)D_{+0}(z)$ decreases strongly with increasing $z$, the peak height of
collapsing objects of a given mass $M$ increases rapidly with increasing redshift.

Using Equation 7 we can define the characteristic mass scale for which a typical peak
($\nu = 1$) collapses at redshift $z$:

$$\sigma(M_{\rm NL}, z) = \sigma(M_{\rm NL}, z = 0)D_{+0}(z) = \delta_c(z). \tag{8}$$

This nonlinear mass, $M_{\rm NL}(z)$, is a key quantity in the self-similar models of structure formation,
which we consider in Section 3.9.
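
Equations 7 and 8 can be made concrete with a toy model. The sketch below is illustrative only: it assumes a power-law $\sigma(M)$ with hypothetical normalization and slope (not a fit to any real power spectrum), a constant $\delta_c = 1.686$, and the Einstein-de Sitter growth factor $D_{+0}(z) = 1/(1+z)$. A realistic calculation would replace `sigma` and `growth_eds` with quantities computed for the adopted cosmology.

```python
import math

DELTA_C = 1.686   # linear collapse threshold, taken as constant in z (assumption)

def growth_eds(z):
    """Growth factor normalized to unity at z = 0 (Einstein-de Sitter assumption)."""
    return 1.0 / (1.0 + z)

def sigma(mass, z, sigma_ref=0.8, mass_ref=2.0e14, slope=0.3):
    """Toy rms fluctuation: a power law in mass with hypothetical parameters."""
    return sigma_ref * (mass / mass_ref) ** (-slope) * growth_eds(z)

def peak_height(mass, z):
    """Equation 7: nu = delta_c / sigma(M, z)."""
    return DELTA_C / sigma(mass, z)

def nonlinear_mass(z, m_lo=1.0e5, m_hi=1.0e18, tol=1e-6):
    """Equation 8: solve sigma(M_NL, z) = delta_c by bisection in ln M.

    Assumes the root lies between m_lo and m_hi, which holds for this toy
    model at moderate redshifts.
    """
    lo, hi = math.log(m_lo), math.log(m_hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sigma(math.exp(mid), z) > DELTA_C:
            lo = mid      # sigma still too large: M_NL lies at higher mass
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

for z in (0.0, 1.0, 3.0):
    print(z, f"{nonlinear_mass(z):.3e}", round(peak_height(1.0e15, z), 2))
```

In this toy model the nonlinear mass drops steeply with redshift, while the peak height of a fixed cluster-scale mass rises, in line with the qualitative statements above.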



3.3 Nonlinear collapse of real density peaks 

The spherical collapse model provides a useful approximate guideline for the time scale of 
halo collapse and has proven to be a very useful tool in developing approximate statistical 
models for the formation and evolution of halo populations. Such a simple model and its 
extensions (e.g., ellipsoidal collapse model) do, however, miss many important details and 
complexities of collapse of the real density peaks. Such complexities are usually explored 
using three-dimensional numerical cosmological simulations. Techniques and numerical 
details of such simulations are outside the scope of this review and we refer readers to 
recent reviews on this subject (Bertschinger 1998, Borgani & Kravtsov 2011, Dolag et al.
2008, Norman 2010). Here, we simply discuss the main features of gravitational collapse
learned from analyses of such simulations.

Figure 6 shows evolution of the DM density field in a cosmological simulation of a comoving
region of $15h^{-1}$ Mpc on a side around a cluster mass-scale density peak in the initial
perturbation field from $z = 3$ to the present epoch. The overall picture is quite different
from the top-hat collapse. First of all, real peaks in the primordial field do not have the constant
density or sharp boundary of the top-hat, but have a certain radial profile and curvature
(Bardeen et al. 1986, Dalal et al. 2008). As a result, different regions of a peak collapse at
different times so that the overall collapse is extended in time and the peak does not have
a single collapse epoch (e.g., Diemand, Kuhlen & Madau 2007). Consequently, the distribution
of matter around the collapsed peak can smoothly extend to several virial radii for
late epochs and small masses (Cuesta et al. 2008, Prada et al. 2006). This creates ambiguity
about the definition of halo mass and results in a variety of mass definitions adopted in
practice, as we discuss in Section 3.6.

Second, the peaks in the smoothed density field, $\delta_R(\mathbf{x})$, are not isolated but are surrounded
by other peaks and density inhomogeneities. The tidal forces from the most massive and
rarest peaks in the initial density field shepherd the surrounding matter into massive filamentary
structures that connect them (Bond, Kofman & Pogosyan 1996). Accretion of
matter onto clusters at late epochs occurs preferentially along such filaments, as can be
clearly seen in Figure 6.

Finally, the density distribution within the peaks in the actual density field is not smooth,
as in the smoothed field $\delta_R(\mathbf{x})$, but contains fluctuations on all scales. Collapse of density
peaks on different scales can proceed almost simultaneously, especially during early stages
of evolution in the CDM models when peaks undergoing collapse involve small scales,
over which the power spectrum has an effective slope $n \approx -3$. Figure 6 shows that at
high redshifts the proto-cluster region contains mostly small-mass collapsed objects, which
merge to form a larger and larger virialized system near the center of the shown region at
later epochs. Nonlinear interactions between smaller-scale peaks within a cluster-scale peak
during mergers result in relaxation processes and energy exchange on different scales, and
mass redistribution. Although the processes accompanying major mergers are not as violent
as envisioned in the violent relaxation scenario (Valluri et al. 2007), such interactions lead
to significant redistribution of mass (Kazantzidis, Zentner & Kravtsov 2006) and angular
momentum (Vitvitska et al. 2002), both within and outside of the virial radius.

Fig. 6. Evolution of a dark matter density field in a comoving region of $15h^{-1}$ Mpc on a side around
a cluster mass density peak in the initial perturbation field. The four panels, from top left to bottom
right, show redshifts $z = 3$, $z = 1$, $z = 0.5$ and $z = 0$. The forming cluster has a mass
$M_{200} \approx 1.2 \times 10^{15}h^{-1}M_\odot$ at $z = 0$. The figure illustrates the complexities of the actual collapse of
real density peaks: strong deviations from spherical symmetry, accretion of matter along filaments,
and the presence of smaller-scale structure within the collapsing cluster-scale mass peak.



3.4 Equilibrium 

Following the collapse, matter settles into an equilibrium configuration. For collisional 
baryonic component this configuration is approximately described by the hydrostatic equilibrium
(HE hereafter) equation, in which the pressure gradient $\nabla p(\mathbf{x})$ at point $\mathbf{x}$ is balanced
by the gradient of the local gravitational potential $\nabla\phi(\mathbf{x})$: $\nabla\phi(\mathbf{x}) = -\nabla p(\mathbf{x})/\rho_{\rm g}(\mathbf{x})$, where $\rho_{\rm g}(\mathbf{x})$
is the gas density. Under the further assumption of spherical symmetry, the HE equation
can be written as $\rho_{\rm g}^{-1}\,dp/dr = -GM(<r)/r^2$, where $M(<r)$ is the mass contained within
the radius $r$. Assuming the equation of state of an ideal gas, $p = \rho_{\rm g}k_{\rm B}T/(\mu m_p)$, where $\mu$ is the
mean molecular weight and $m_p$ is the proton mass, the cluster mass within $r$ can be expressed
in terms of the density and temperature profiles, $\rho_{\rm g}(r)$ and $T(r)$, as

$$M_{\rm HE}(<r) = -\frac{r\,k_{\rm B}T(r)}{G\mu m_p}\left[\frac{d\ln\rho_{\rm g}(r)}{d\ln r} + \frac{d\ln T(r)}{d\ln r}\right]. \tag{9}$$

Interestingly, the slopes of the gas density and temperature profiles that enter the above 
equation exhibit a correlation that appears to be a dynamical attractor during cluster formation
(Juncher, Hansen & Maccio 2012).
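
A minimal numerical sketch of the hydrostatic mass estimate (Equation 9) is given below. The function name, the CGS constants, and the default mean molecular weight $\mu = 0.6$ are assumptions made for illustration; the logarithmic slopes must be supplied by the user, for example from fits to observed gas density and temperature profiles.

```python
G = 6.674e-8             # gravitational constant [cm^3 g^-1 s^-2]
M_PROTON = 1.6726e-24    # proton mass [g]
KEV = 1.602e-9           # 1 keV in erg
MPC = 3.0857e24          # 1 Mpc in cm
M_SUN = 1.989e33         # solar mass [g]

def hydrostatic_mass(r_mpc, kT_kev, dlnrho_dlnr, dlnT_dlnr, mu=0.6):
    """Equation 9: mass (in solar masses) within r from gas profile slopes.

    mu = 0.6 is an assumed mean molecular weight for an ionized plasma.
    """
    r = r_mpc * MPC
    kT = kT_kev * KEV
    mass = -r * kT / (G * mu * M_PROTON) * (dlnrho_dlnr + dlnT_dlnr)
    return mass / M_SUN

# Isothermal gas with kT = 5 keV and a density slope of -2 at r = 1 Mpc
print(f"{hydrostatic_mass(1.0, 5.0, -2.0, 0.0):.2e} Msun")
```

For these illustrative inputs the estimate is a few times $10^{14}\,M_\odot$, a typical cluster mass scale.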



For a collisionless system of particles, such as CDM, the condition of equilibrium is given 
by the Jeans equation (e.g., Binney & Tremaine 2008). For a non-rotating spherically symmetric
system, this equation can be written as

$$M_{\rm J}(<r) = -\frac{r\,\sigma_r^2}{G}\left[\frac{d\ln\nu(r)}{d\ln r} + \frac{d\ln\sigma_r(r)^2}{d\ln r} + 2\beta(r)\right], \tag{10}$$

where $\beta = 1 - \sigma_t^2/(2\sigma_r^2)$ is the orbit anisotropy parameter defined in terms of the radial ($\sigma_r$)
and tangential ($\sigma_t$) velocity dispersion components ($\beta = 0$ for isotropic velocity field).
We consider equilibrium density and velocity dispersion profiles, as well as the anisotropy
profile $\beta(r)$, in § 3.5.2. Equation 10 is also commonly used to describe the equilibrium
of cluster galaxies. Although, in principle, galaxies in groups and clusters are not strictly 
collisionless, interactions between galaxies are relatively rare and the Jeans equation should 
be quite accurate. 
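
The Jeans mass estimator of Equation 10 can be sketched in the same spirit. In the example below, $\nu(r)$ is interpreted as the number density profile of the tracer population (an interpretation, since the text does not define it explicitly), and the function name and unit choices are illustrative assumptions.

```python
G_KMS = 4.30091e-9   # G in units of Mpc (km/s)^2 / Msun

def jeans_mass(r_mpc, sigma_r_kms, dlnnu_dlnr, dlnsigma2_dlnr, beta=0.0):
    """Equation 10: mass (Msun) within r for a spherical, non-rotating system.

    dlnnu_dlnr     : logarithmic slope of the tracer density profile
    dlnsigma2_dlnr : logarithmic slope of the radial dispersion squared
    beta           : orbit anisotropy parameter (0 for isotropic orbits)
    """
    bracket = dlnnu_dlnr + dlnsigma2_dlnr + 2.0 * beta
    return -r_mpc * sigma_r_kms**2 / G_KMS * bracket

# Tracers with density slope -3 and a flat dispersion of 1000 km/s at 1 Mpc
print(f"isotropic: {jeans_mass(1.0, 1000.0, -3.0, 0.0):.2e} Msun")
print(f"beta=0.3 : {jeans_mass(1.0, 1000.0, -3.0, 0.0, beta=0.3):.2e} Msun")
```

The comparison shows how the inferred mass depends on the assumed anisotropy: for the same measured radial dispersion, radially biased orbits ($\beta > 0$) imply a lower mass.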

Note that the difference between the equilibrium configurations of the collisional ICM and
collisionless DM and galaxy systems is significant. In HE, the iso-density surfaces of the ICM
should trace the iso-potential surfaces. The shape of the iso-potential surfaces in equilib- 
rium is always more spherical than the shape of the underlying mass distribution that gives 
rise to the potential. Given that the potential is dominated by DM at most of the cluster- 
centric radii, the ICM distribution (and consequently the X-ray isophotes and SZ maps) 
will be more spherical than the underlying DM distribution. 

As we noted in the previous section, the gravitational collapse of a halo is a process ex- 
tended in time. Consequently, a cluster may not reach complete equilibrium over the Hub- 
ble time due to ongoing accretion of matter and the occurrence of minor and major mergers. 
The ICM reaches an equilibrium state following a major merger only after $\approx 3-4$ Gyr (e.g.,
Nelson et al. 2011). Deviations from equilibrium affect observable properties of clusters
and cause systematic errors when Equations 9 and 10 are used to estimate cluster masses
(e.g., Ameglio et al. 2009; Lau, Kravtsov & Nagai 2009; Nagai, Kravtsov & Vikhlinin
2007; Piffaretti & Valdarnini 2008; Rasia, Tormen & Moscardini 2004).



3.5 Internal structure of cluster halos 



Relaxation processes establish the equilibrium internal structure of clusters. Below we
review our current understanding of the equilibrium radial density distribution, velocity
dispersion, and triaxiality (shape) of the cluster DM halos.

3.5.1 Density Profile.



Internal structure of collapsed halos may be expected to depend both on the properties of
the initial density distribution around collapsing peaks (Hoffman & Shaham 1985) and on
the processes accompanying hierarchical collapse (e.g., Syer & White 1998, Valluri et al.
2007). Simulations have demonstrated, however, that the characteristic form of the spherically
averaged density profile arising in CDM models, characterized by the logarithmic
slope steepening with increasing radius (Dubinski & Carlberg 1991; Katz 1991; Navarro,
Frenk & White 1995, 1996), is virtually independent of the shape of the power spectrum and
background cosmology (Cole & Lacey 1996; Huss, Jain & Steinmetz 1999b; Katz 1991;
Navarro, Frenk & White 1997), a result that is far from trivial. Such a generic form of the
profile also arises when small-scale structure is suppressed and the collapse is smooth, as is
the case for halos forming at the cut-off scale of the power spectrum (Diemand, Moore &
Stadel 2005; Moore et al. 1999; Wang & White 2009) or even from non-cosmological initial
conditions (Huss, Jain & Steinmetz 1999a).

The density profiles measured in dissipationless simulations are most commonly approximated
by the "NFW" form proposed by Navarro, Frenk & White (1995) based on their
simulation of cluster formation:

$$\rho_{\rm NFW}(r) = \frac{4\rho_s}{x(1 + x)^2}, \quad x \equiv r/r_s, \tag{11}$$

where $r_s$ is the scale radius, at which the logarithmic slope of the profile is equal to $-2$, and
$\rho_s$ is the characteristic density at $r = r_s$. Overall, the slope of this profile varies with radius
as $d\ln\rho/d\ln r = -[1 + 2x/(1 + x)]$, i.e., from the asymptotic slope of $-1$ at $x \ll 1$ to $-3$ at
$x \gg 1$, where the enclosed mass diverges logarithmically: $M(<r) = M_\Delta f(x)/f(c_\Delta)$, where
$M_\Delta$ is the mass enclosing a given overdensity $\Delta$, $f(x) \equiv \ln(1 + x) - x/(1 + x)$ and $c_\Delta \equiv R_\Delta/r_s$
is the concentration parameter. Accurate formulae for the conversion of mass of the NFW
halos defined for different values of $\Delta$ are given in the appendix of Hu & Kravtsov (2003).
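
The NFW relations above translate directly into code. The following sketch uses hypothetical function names and implements Equation 11 with the normalization in which $\rho_s$ is the density at $r = r_s$, together with the logarithmic slope and the enclosed mass; units are arbitrary as long as they are used consistently.

```python
import math

def nfw_density(r, rho_s, r_s):
    """Equation 11, normalized so that rho(r_s) = rho_s."""
    x = r / r_s
    return 4.0 * rho_s / (x * (1.0 + x) ** 2)

def nfw_log_slope(r, r_s):
    """d ln(rho) / d ln(r) = -[1 + 2x / (1 + x)]."""
    x = r / r_s
    return -(1.0 + 2.0 * x / (1.0 + x))

def f_nfw(x):
    """f(x) = ln(1 + x) - x / (1 + x)."""
    return math.log(1.0 + x) - x / (1.0 + x)

def nfw_enclosed_mass(r, m_delta, r_delta, c_delta):
    """M(<r) = M_Delta * f(r / r_s) / f(c_Delta), with r_s = R_Delta / c_Delta."""
    r_s = r_delta / c_delta
    return m_delta * f_nfw(r / r_s) / f_nfw(c_delta)

# Example: a halo with concentration 5; fraction of M_Delta within r_s
print(nfw_enclosed_mass(0.2, 1.0, 1.0, 5.0))
```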



Subsequent simulations (Graham et al. 2006, Merritt et al. 2006, Navarro et al. 2004)
showed that the Einasto (1965) profile and other similar models designed to describe de-projection
of the Sersic profile (Merritt et al. 2006) provide a more accurate description
of the DM density profiles arising during cosmological halo collapse, as well as profiles of
bulges and elliptical galaxies (Cardone, Piedipalumbo & Tortora 2005). The Einasto profile
is characterized by the logarithmic slope that varies as a power law with radius:

$$\rho_{\rm E}(r) = \rho_s\exp\left[-\frac{2}{\alpha}\left(x^\alpha - 1\right)\right], \quad x \equiv r/r_s, \tag{12}$$

where $r_s$ is again the scale radius at which the logarithmic slope is $-2$, but now for the
Einasto profile, $\rho_s \equiv \rho_{\rm E}(r_s)$, and $\alpha$ is an additional parameter that describes the power-law
dependence of the logarithmic slope on radius: $d\ln\rho_{\rm E}/d\ln r = -2x^\alpha$.

Note that unlike the NFW profile and several other profiles discussed in the literature, the 
Einasto profile does not have an asymptotic slope at small radii. The slope of the density 
profile becomes increasingly shallower at small radii at the rate controlled by $\alpha$. The parameter
$\alpha$ varies with halo mass and redshift: at $z = 0$ galaxy-sized halos are described
by $\alpha \approx 0.16$, whereas massive cluster halos are described by $\alpha \approx 0.2-0.3$; these values
increase by $\sim 0.1$ by $z \approx 3$ (Gao et al. 2008). Although $\alpha$ depends on mass and redshift (and
thus also on the cosmology) in a non-trivial way, Gao et al. (2008, and see also Duffy et al.
2008) showed that these dependencies can be captured as a universal dependence on the
peak height $\nu = \delta_c/\sigma(M, z)$ (see Section 3.2.2 above): $\alpha = 0.0095\nu^2 + 0.155$. Finally, unlike
the NFW profile, the total mass for the Einasto profile is finite due to the exponentially
decreasing density at large radii. A number of useful expressions for the Einasto profile,
such as mass within a radius, are provided by Cardone, Piedipalumbo & Tortora (2005),
Mamon & Lokas (2005), and Graham et al. (2006).
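
For reference, the Einasto profile (Equation 12) and the peak-height dependence of its shape parameter quoted above can be written as follows. The function names are illustrative; the only inputs taken from the text are the functional form of the profile and the relation $\alpha = 0.0095\nu^2 + 0.155$.

```python
import math

def einasto_density(r, rho_s, r_s, alpha):
    """Equation 12: rho_s is the density at the scale radius r_s."""
    x = r / r_s
    return rho_s * math.exp(-2.0 / alpha * (x**alpha - 1.0))

def einasto_log_slope(r, r_s, alpha):
    """d ln(rho) / d ln(r) = -2 x^alpha."""
    return -2.0 * (r / r_s) ** alpha

def einasto_alpha(nu):
    """Shape parameter as a function of peak height, as quoted in the text."""
    return 0.0095 * nu**2 + 0.155

# The slope keeps getting shallower toward the center, with no asymptote
for x in (1e-3, 1e-2, 1e-1, 1.0):
    print(x, round(einasto_log_slope(x, 1.0, einasto_alpha(3.0)), 3))
```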



The origin of the generic form of the density profile has recently been explored in detail by
Lithwick & Dalal (2011), who show that it arises due to two main factors: (a) the density
and triaxiality profile of the original peak and (b) approximately adiabatic contraction of
the previously collapsed matter due to deepening of the potential well during continuing
collapse. Without adiabatic contraction the profile resulting from the collapse would reflect
the shape of the initial profile of the peak. For example, if the initial profile of mean linear
overdensity within radius $r$ around the peak can be described as $\delta_L \propto r^{-\gamma}$, it can be shown
that the resulting differential density profile after collapse without adiabatic contraction
behaves as $\rho(r) \propto r^{-g}$, where $g = 3\gamma/(1 + \gamma)$ (Fillmore & Goldreich 1984). Typical profiles
of initial density peaks are characterized by shallow slopes, $\gamma \sim 0-0.3$ at small radii, and
very steep slopes at large radii (e.g., Dalal et al. 2008), which means that resulting profiles
after collapse should have slopes varying from $g \approx 0-0.7$ at small radii to $g \approx 3$ at large
radii.
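
The mapping between the initial and final slopes quoted above is simple enough to tabulate directly. The short sketch below (with an illustrative function name) evaluates $g = 3\gamma/(1 + \gamma)$ for a few values of $\gamma$ and shows the two limits discussed in the text.

```python
def collapsed_slope(gamma):
    """g = 3 gamma / (1 + gamma) for an initial profile delta_L ~ r^(-gamma)."""
    return 3.0 * gamma / (1.0 + gamma)

for gamma in (0.0, 0.3, 1.0, 10.0, 1000.0):
    print(gamma, round(collapsed_slope(gamma), 3))
```

Shallow initial slopes, $\gamma \lesssim 0.3$, map to $g \lesssim 0.7$, whereas very steep initial slopes approach the limiting value $g = 3$.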



However, Lithwick & Dalal (2011) showed that contraction of particle orbits during subsequent
accretion of mass interior to a given radius $r$ leads to a much more gradual change of
logarithmic slope with radius, such that the regime within which $g \approx 0-0.7$ is shifted to
very small radii ($r/r_{\rm vir} \lesssim 10^{-5}$), whereas at the radii typically resolved in cosmological simulations
the logarithmic slope is in the range of $g \approx 1-3$, so that the radial dependence of
the logarithmic slope $g(r) = -d\ln\rho/d\ln r$ is in good qualitative agreement with simulation
results. This contraction occurs because matter that is accreted by a halo at a given stage 
of its evolution can deposit matter over a wide range of radii, including small radii. The 
orbits of particles that accreted previously have to respond to the additional mass, and they 
do so by contracting. For example, for a purely spherical system in which mass is added 
slowly so that the adiabatic invariant is conserved, radii r of spherical shells must decrease 
to compensate an increase of M(< r). This model thus elegantly explains both the qualita- 
tive shape of density profiles observed in cosmological simulations and their universality. 
The latter can be expected because the contraction process crucial to shaping the form of 
the profile should operate under general collapse conditions, in which different shells of 
matter collapse at different times. 



Although the model of Lithwick & Dalal (2011) provides a solid physical picture of halo
profile formation, it also neglects some of the processes that may affect details of the resulting
density profile, most notably the effects of mergers. Indeed, major mergers lead to
resonant dynamical heating of a certain fraction of collapsed matter due to the potential
fluctuations and tidal forces that they induce. The amount of mass that is affected by such
heating is significant (e.g., Valluri et al. 2007). In fact, up to $\sim 40\%$ of mass within the virial
radii of merging halos may end up outside of the virial radius of the merger. This implies,
for example, that virial mass is not additive in major mergers. Nevertheless, in practice the
merger remnant retains the functional form of the density profiles of the merger progenitors
(Kazantzidis, Zentner & Kravtsov 2006), which means that major mergers do not lead to
efficient violent relaxation.

Although the functional form of the density profile arising during halo collapse is generic
for a wide variety of collapse conditions and models, initial conditions and cosmology do
significantly affect the physical properties of halo profiles such as their characteristic density
and scale radius (Navarro, Frenk & White 1997). These dependencies are often discussed
in terms of halo concentrations, $c_\Delta \equiv R_\Delta/r_s$. Simulations show that the scale radius is approximately
constant during late stages of halo evolution (Bullock et al. 2001, Wechsler
et al. 2002), but evolves as $r_s = R_\Delta/c_{\rm min}$ during early stages, when a halo quickly increases
its mass through accretion and mergers (Zhao et al. 2009, 2003). The minimum value of
concentration is $c_{\rm min} = {\rm const} \approx 3-4$ for $\Delta = 200$. For massive cluster halos, which are
in the fast growth regime at any redshift, the concentrations are thus expected to stay approximately
constant with redshift or may even increase after reaching a minimum (Klypin,
Trujillo-Gomez & Primack 2011; Prada et al. 2012).



The characteristic time separating the two regimes can be identified as the formation epoch 
of halos. This time approximately determines the value of the scale radius and the subse- 
quent evolution of halo concentration. The initial conditions and cosmology determine the 
formation epoch and the typical mass accretion histories for halos of a given mass (Bullock
et al. 2001; Navarro, Frenk & White 1997; Zhao et al. 2009), and therefore determine the
halo concentrations. Although these dependences are non-trivial functions of halo mass and
redshift, they can also be encapsulated by a universal function of the peak height $\nu$ (Prada
et al. 2012, Zhao et al. 2009).



Baryon dissipation and feedback are expected to affect the density profiles of halos appre- 
ciably, although predictions for these effects are far less certain than predictions of the DM 
distribution in the purely dissipationless regime. The main effect is contraction of DM in 
response to the increasing depth of the central potential during baryon cooling and con- 
densation, which is often modelled under the assumption of slow contraction conserving 
adiabatic invariants of particle orbits (e.g., Barnes & White 1984, Blumenthal et al. 1986,
Ryden & Gunn 1987, Zeldovich et al. 1980). The standard model of such adiabatic contraction
assumes that DM particles are predominantly on circular orbits, and for each shell
of DM at radius $r$ the product of the radius and the enclosed mass, $rM(r)$, is conserved (Blumenthal
et al. 1986). The model makes a number of simplifying assumptions and does not
take into account effects of mergers. Nevertheless, it was shown to provide a reasonably
accurate description of the results of cosmological simulations (Gnedin et al. 2004). Its accuracy
can be further improved by relaxing the assumption of circular orbits and adopting
an empirical ansatz, in which the conserved quantity is $rM(\bar{r})$, where $\bar{r}$ is the average radius
along the particle orbit, instead of $rM(r)$ (Gnedin et al. 2004). At the same time, several
recent studies showed that no single set of parameters of such simple models describes all
objects that form in cosmological simulations equally well (Abadi et al. 2010; Gnedin et al.
2011; Gustafsson, Fairbairn & Sommer-Larsen 2006; Tissera et al. 2010).
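
The standard adiabatic contraction model described above can be sketched as a root-finding problem. The example below is a simplified illustration under stated assumptions, not the procedure of any particular paper: DM shells are on circular orbits and do not cross, baryons initially trace the total mass with a fraction `f_b`, and the final baryonic mass profile is supplied by the user. All names are hypothetical. For each initial radius $r_i$ it solves $r_i M_i(r_i) = r_f\,[M_{\rm b}(r_f) + (1 - f_{\rm b})M_i(r_i)]$ for the final radius $r_f$.

```python
def contracted_radius(r_i, m_init, f_b, m_bar_final, rel_tol=1e-12):
    """Final radius r_f of a DM shell that starts at radius r_i.

    m_init      : callable, total initial mass within radius r
    f_b         : baryon fraction of the initial mass distribution
    m_bar_final : callable, final baryonic mass within radius r

    Uses bisection on 0 < r_f <= r_i, which assumes that baryons end up
    at least as concentrated as they started.
    """
    target = r_i * m_init(r_i)            # conserved product r * M(r)
    m_dm = (1.0 - f_b) * m_init(r_i)      # DM mass interior to the shell

    lo, hi = 0.0, r_i
    while hi - lo > rel_tol * r_i:
        mid = 0.5 * (lo + hi)
        if mid * (m_bar_final(mid) + m_dm) < target:
            lo = mid      # product too small: the shell sits farther out
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy setup in arbitrary units: M_i(r) = r, baryon fraction 0.15, and all
# baryons within r = 1 condensed to the center (a point mass of 0.15).
r_f = contracted_radius(0.1, lambda r: r, 0.15, lambda r: 0.15)
print(r_f)   # smaller than the initial radius of 0.1
```

In this toy case the shell contracts by more than a factor of two, which illustrates how strongly central baryon condensation can deepen the inner DM distribution in the simplest version of the model.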



A more subtle but related effect is the increase of the overall concentration of DM within 
the virial radius of halos due to re-distribution of binding energy between DM and baryons 
during the process of cluster assembly (Rudd, Zentner & Kravtsov 2008). The larger range
of radii over which this effect operates makes it a potential worry for the precision constraints
from the cosmic shear power spectrum (Jing et al. 2006; Rudd, Zentner & Kravtsov
2008). This effect depends primarily on the fraction of baryons that condense into the central
halo galaxies and may be mitigated by the blow-out of gas by efficient AGN or SN
feedback (van Daalen et al. 2011). The effects of baryons on the overall concentration of
mass distribution in clusters are thus uncertain, but can potentially increase halo concentration
and thereby significantly enhance the cross section for strong lensing (Mead et al.
2010, Puchwein et al. 2005, Rozo et al. 2008) and affect statistics of strong lens distribution
in groups and clusters (e.g., More et al. 2011).



A number of studies have derived observational constraints on density profiles of clusters 
and their concentrations (Buote et al. 2007; Ettori et al. 2010; Mandelbaum, Seljak & Hirata
2008; Okabe et al. 2010; Pointecouteau, Arnaud & Pratt 2005; Schmidt & Allen 2007;
Sereno & Zitrin 2012; Umetsu et al. 2011a,b; Vikhlinin et al. 2006; Wojtak & Lokas 2010).
Although most of these studies find that the concentrations of galaxy clusters predicted by
$\Lambda$CDM simulations are in the ballpark of values derived from observations, the agreement
is not perfect and there is tension between model predictions and observations, which may
be due to effects of baryon dissipation (e.g., Fedeli 2012; Rudd, Zentner & Kravtsov 2008).



Some studies do find that the concordance cosmology predictions of the average cluster 
concentrations are somewhat lower than the average values derived from X-ray observations
(Buote et al. 2007, Duffy et al. 2008, Schmidt & Allen 2007). Moreover, lensing
analyses indicate that the slope of the density profile in central regions of some clusters
may be shallower than predicted (Newman et al. 2011, 2009; Sand et al. 2008, 2004; Tyson,
Kochanski & dell'Antonio 1998), whereas concentrations are considerably higher than both
theoretical predictions and most other observational determinations from X-ray and WL
analyses (Comerford & Natarajan 2007; Oguri et al. 2012, 2009; Zitrin et al. 2011).



At this point, it is not clear whether these discrepancies imply serious challenges to the 
$\Lambda$CDM structure formation paradigm, unknown baryonic effects flattening the profiles in
the centers, or unaccounted systematics in the observational analyses (e.g., Dalal & Keeton
2003, Hennawi et al. 2007). When considering such comparisons, it is important to remember
that density profiles in cosmological simulations are always defined with respect to the
center defined as the global density peak or potential minimum, whereas in observations 
the corresponding location is not as unambiguous as in simulations and the choice of center 
may affect the derived slope. 

It should be noted that improved theoretical predictions for cluster-sized systems generally 
predict larger concentrations for the most massive objects than do extrapolations of the 
concentration-mass relations from smaller mass objects (Bhattacharya, Habib & Heitmann
2011; Prada et al. 2012; Zhao et al. 2009). In addition, as we noted above, the evolution
predicted for the concentrations of these rarest objects is much weaker than $c \propto (1 + z)^{-1}$
found for smaller mass halos, so rescaling the concentrations of high-redshift clusters by
a $(1 + z)$ factor, as is often done, could lead to an overestimate of their concentrations.



3.5.2 Velocity dispersion profile and velocity anisotropy. 

Velocity dispersion profile is a halo property related to its density profile. Simulations show 
that this profile generally increases from the central value to a maximum at $r \approx r_s$ and
slowly decreases outward (e.g., Cole & Lacey 1996; Rasia, Tormen & Moscardini 2004).



One remarkable result illustrating the close connection between density and velocity dis- 
persion is that for collapsed halos in dissipationless simulations the ratio of density to the 
cube of the rms velocity dispersion can be accurately described by a power law over at least 
three decades in radius (Taylor & Navarro 2001): $Q(r) = \rho/\sigma^3 \propto r^{-\alpha}$ with $\alpha \approx 1.9$.



An important quantity underlying the measured velocity dispersion profile is the profile of 
the mean velocity, and the mean radial velocity, $\bar{v}_r$, in particular. For a spherically symmetric
matter distribution in HE, we expect $\bar{v}_r = 0$. Therefore, the profile of $\bar{v}_r$ is a useful diagnostic
of deviations from equilibrium at different radii. Simulations show that clusters at $z = 0$
generally have zero mean radial velocities within $r \approx R_{\rm vir}$, which turn sharply negative between
1 and $\sim 3R_{\rm vir}$, where density is dominated by matter infalling onto the cluster (Cole & Lacey
1996; Cuesta et al. 2008; Eke, Navarro & Frenk 1998).



The distinguishing characteristic between gas and DM is the fact that gas has an isotropic 
velocity dispersion tensor on small scales, whereas DM in general does not. On large scales, 
however, both gas and DM may have velocity fields that are anisotropic. The degree of 
velocity anisotropy is commonly quantified by the anisotropy profile, $\beta(r)$ (see § 3.4). DM
anisotropy is mild: $\beta \approx 0-0.1$ near the center and increases to $\beta \approx 0.2-0.4$ near the virial
radius (Cole & Lacey 1996; Colin, Klypin & Kravtsov 2000; Eke, Navarro & Frenk 1998;
Lemze et al. 2011; Rasia, Tormen & Moscardini 2004). Interestingly, velocities exhibit
substantial tangential anisotropy outside the virial radius in the infall region of clusters
(Cuesta et al. 2008, Lemze et al. 2011). Another interesting finding is that the velocity
anisotropy correlates with the slope of the density profile (Hansen & Moore 2006), albeit
with significant scatter (Lemze et al. 2011).



The gas component also has some residual motions driven by mergers and gas accretion
along filaments. Gas velocities tend to have tangential anisotropy (Rasia, Tormen &
Moscardini 2004), because radial motions are inhibited by the entropy profile, which is
convectively stable in general. 



3.5.3 Shape. 



Although the density structure of mass distribution in clusters is most often described by 
spherically averaged profiles, clusters are thought to collapse from generally triaxial density
peaks (Bardeen et al. 1986, Doroshkevich 1970). The distribution of matter within
halos resulting from hierarchical collapse is triaxial as well (Allgood et al. 2006, Cole &
Lacey 1996, Dubinski & Carlberg 1991, Frenk et al. 1988, Jing & Suto 2002, Kasun &
Evrard 2005, Warren et al. 1992), with triaxiality predicted by dissipationless simulations
increasing with decreasing distance from halo center (Allgood et al. 2006). Triaxiality of
halos decreases with decreasing mass and redshift (Allgood et al. 2006, Kasun & Evrard
2005) in a way that again can be parameterized in a universal form as a function of peak
height (Allgood et al. 2006). The major axis of the triaxial distribution of clusters is generally
aligned with the filament connecting a cluster with its nearest neighbor of comparable
mass (e.g., Lee et al. 2008, West & Blakeslee 2000), which reflects the fact that a significant
fraction of mass accretion and mergers occurs along such filaments (e.g., Lee & Evrard 2007,
Onuora & Thomas 2000).



Jing & Suto (2002) showed how the formalism of density distribution as a function of 
distance from cluster center can be extended to the density distribution in triaxial shells. 
Accounting for such triaxiality is particularly important in theoretical predictions and observational
analyses of weak and strong lensing (Becker & Kravtsov 2011; Clowe, De
Lucia & King 2004; Corless & King 2007; Dalal & Keeton 2003; Hennawi et al. 2007;
Oguri et al. 2005). At the same time, it is important to keep in mind that, as with many
other results derived mainly from dissipationless simulations, the physics of baryons may 
modify predictions substantially. 

The shape of the DM distribution in particular is quite sensitive to the degree of central 
concentration of mass. As baryons condense towards the center to form a central galaxy 
within a halo, the DM distribution becomes more spherical (Dubinski 1994; Evrard, Summers
& Davis 1994; Kazantzidis et al. 2004; Tissera & Dominguez-Tenreiro 1998). The
effect increases with decreasing radius, but is substantial even at half of the virial radius
(Kazantzidis et al. 2004). The main mechanism behind this effect lies in adiabatic changes
of the shapes of particle orbits in response to more centrally concentrated mass distribution
after baryon dissipation (Debattista et al. 2008, Dubinski 1994).



In considering effects of triaxiality, it is important to remember that triaxiality of the hot
intracluster gas and DM distribution are different (gas is rounder; see, e.g., discussion in Lau
et al. 2011, and references therein). This is one of the reasons why mass proxies defined
within spherical aperture using observable properties of gas (see § 4 below) exhibit small 
scatter and are much less sensitive to cluster orientation. 

The observed triaxiality of the ICM can be used as a probe of the shape of the underlying
potential (Lau et al. 2011) and as a powerful diagnostic of the amount of dissipation that is
occurring in cluster cores (Fang, Humphrey & Buote 2009) and of the mass of the central
cluster galaxy (Lau et al. 2012).



3.6 Mass definitions 



As we discussed above, the existence of a particular density contrast delineating a halo 
boundary is predicted only in the limited context of the spherical collapse of a density 
fluctuation with the top-hat profile (i.e., uniform density, sharp boundary). Collapse in such 
a case proceeds on the same time scale at all radii and the collapse time and "virial radius" 
are well defined. However, the peaks in the initial density field are not uniform in density, 
are not spherical, and do not have a sharp boundary. Existence of a density profile results 
in different times of collapse for different radial shells. Note also that even in the spherical 
collapse model the virial density contrast formally applies only at the time of collapse; after
a given density peak collapses, its internal density stays constant while the reference (i.e.,
either the mean or critical) density changes merely due to cosmological expansion. The
actual overdensity of the collapsed top-hat initial fluctuations will therefore grow larger
than the initial virial overdensity at $t > t_{\rm collapse}$.
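
As a rough numerical illustration of this growth, the sketch below assumes an Einstein-de Sitter universe, in which the mean matter density dilutes as $(1+z)^3$ and the virial contrast at collapse is $18\pi^2 \approx 178$. The function name and the choice of cosmology are illustrative assumptions, not taken from the text.

```python
import math

def contrast_after_collapse(z_collapse, z, delta_vir=18.0 * math.pi ** 2):
    """Contrast relative to the mean matter density at redshift z <= z_collapse
    for a top-hat that virialized at z_collapse and kept its physical density fixed.
    Assumes an Einstein-de Sitter background (illustrative choice)."""
    if z > z_collapse:
        raise ValueError("z must not exceed z_collapse")
    # The halo density stays constant while the mean density drops as (1 + z)^3.
    return delta_vir * ((1.0 + z_collapse) / (1.0 + z)) ** 3

for z in (1.0, 0.5, 0.0):
    print(f"z = {z:3.1f}: contrast = {contrast_after_collapse(1.0, z):8.1f}")
```

Under these assumptions an object that collapsed at $z = 1$ has a contrast eight times larger than the virial value by $z = 0$, purely because the reference density has dropped.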

The triaxiality of the density peak makes the tidal effects of the surrounding mass distribu- 
tion important. Absence of a sharp boundary, along with the effects of non-uniform den- 
sity, triaxiality and nonlinear effects during the collapse of smaller scale fluctuations within 
each peak, result in a continuous, smooth outer density profile without a well-defined radial 
boundary. Although one can identify a radial range, outside of which a significant fraction
of mass is still infalling, this range is fairly wide and does not correspond to a single
well-defined radius (Cuesta et al. 2008; Eke, Navarro & Frenk 1998). The boundary based on
the virial density contrast is, thus, only loosely motivated by theoretical considerations.

The absence of a well-defined boundary of collapsed objects makes the definition of the 
halo boundary and the associated enclosed mass ambiguous. This explains, at least partly, 
the existence of various halo boundary and mass definitions in the literature. Below we
describe the two main definitions: the Friends-of-Friends (FoF) and the spherical
overdensity (SO; see also White 2001). The FoF mass definition is used predominantly in
analyses of cosmological simulations of cluster formation, whereas the SO halo definition
is used both in observational and simulation analyses, as well as in analytic models, such
as the Halo Occupation Distribution (HOD) model. Although other definitions of the halo
mass are discussed, theoretical mass determinations often have to conform to the
observational definitions of mass. Thus, for example, although it is possible to define the entire
mass that will ever collapse onto a halo in simulations (Anderhalden & Diemand 2011;
Cuesta et al. 2008), it is impossible to measure this mass in observations, which makes it
of interest only from the standpoint of the theoretical models of halo collapse.



3.6.1 The Friends-of-Friends mass.

Historically, the FoF algorithm was used to define groups and clusters of galaxies in
observations (Einasto et al. 1984, Huchra & Geller 1982, Press & Davis 1982) and was adopted
to define collapsed objects in simulations of structure formation (Davis et al. 1985, Einasto
et al. 1984). The FoF algorithm considers two particles to be members of the same group
(i.e., "friends"), if they are separated by a distance that is less than a given linking length.
Friends of friends are considered to be members of a single group - the condition that gives
the algorithm its name. The linking length, the only free parameter of the method, is usually
defined in units of the mean interparticle separation: $b = l/\bar{l}$, where $l$ is the linking length
in physical units and $\bar{l} = \bar{n}^{-1/3}$ is the mean interparticle separation of particles with mean
number density of $\bar{n}$.
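
The linking rule just described can be sketched in a few lines. The example below is a minimal illustration, not the implementation used in any of the cited works: it uses a brute-force pair search with a union-find structure, ignores periodic boundaries, and all function names are hypothetical.

```python
import math
from itertools import combinations

def mean_separation(n_particles, volume):
    """Mean interparticle separation, n^(-1/3), for n = n_particles / volume."""
    return (volume / n_particles) ** (1.0 / 3.0)

def friends_of_friends(points, b, l_mean):
    """Return one group label per point; pairs closer than b * l_mean are linked."""
    link = b * l_mean
    parent = list(range(len(points)))

    def find(i):
        # Walk up to the group root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(N^2) pair search; fine for a sketch, too slow for real simulations.
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < link:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i  # friends of friends share one root

    return [find(i) for i in range(len(points))]

# Three points chained by links of 0.15 plus one distant point, with b = 0.2 and l_mean = 1.
points = [(0.0, 0.0, 0.0), (0.15, 0.0, 0.0), (0.30, 0.0, 0.0), (1.0, 0.0, 0.0)]
labels = friends_of_friends(points, b=0.2, l_mean=1.0)
print(labels)
```

In the usage example the first and third points are farther apart than the linking length, yet they end up in the same group because both are friends of the second point.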

Attractive features of the FoF algorithm are its simplicity (it has only one free parameter), 
a lack of any assumptions about the halo center, and the fact that it does not assume any 
particular halo shape. Therefore, it can better match the generally triaxial, complex mass 
distribution of halos forming in the hierarchical structure formation models. 

The main disadvantages of the FoF algorithm are the difficulty in theoretical interpretation
of the FoF mass, and sensitivity of the FoF mass to numerical resolution and the
presence of substructure. For the smooth halos resolved with many particles the FoF algorithm
with $b = 0.2$ defines the boundary corresponding to the fixed local density contrast of
$\delta_{\rm FoF} \approx 81.62$ (More et al. 2011). Given that halos forming in hierarchical cosmologies have
concentrations that depend on mass, redshift, and cosmology, the enclosed overdensity of
the FoF halos also varies with mass, redshift and cosmology. Thus, for example, for the current
concordance cosmology the FoF halos (defined with $b = 0.2$) of mass $10^{13} - 10^{15}\,M_\odot$
have enclosed overdensities of $\sim 450 - 350$ at $z = 0$ and converge to overdensity of $\sim 200$
at high redshifts where concentration reaches its minimum value of $c \approx 3 - 4$ (More et al.
2011). For small particle numbers the boundary of the FoF halos becomes "fuzzier" and
depends on the resolution (and so does the FoF mass). Simulations most often have fixed 
particle mass and the number of particles therefore changes with halo mass, which means 
that properties of the boundary and mass identified by the FoF are mass dependent. The 
presence of substructure in well-resolved halos further complicates the resolution and mass
dependence of the FoF-identified halos (More et al. 2011). Furthermore, it is well known
that the FoF may spuriously join two neighboring distinct halos with overlapping volumes
into a single group. The fraction of such neighbor halos that are "bridged" increases signif- 
icantly with increasing redshift. 



3.6.2 The Spherical Overdensity mass. 

The spherical overdensity algorithm defines the boundary of a halo as a sphere of radius
enclosing a given density contrast $\Delta$ with respect to the reference density $\rho$. Unlike the FoF
algorithm, the definition of an SO halo also requires a definition of the halo center. The
common choices for the center in theoretical analyses are the peak of density, the minimum
of the potential, the position of the most bound particle, or, more rarely, the center of mass.
Given that the center and the boundary need to be found simultaneously, an iterative scheme
is used to identify the SO boundary around a given peak. The radius of the halo boundary
is defined by solving the implicit equation

$$M(<r) = \frac{4\pi}{3}\,\Delta\,\rho(z)\,r^3, \qquad (13)$$

where $M(<r)$ is the total mass profile, $\rho(z)$ is the reference physical density at redshift
$z$, and $r$ is the physical (not comoving) radius.
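
For a particle distribution, Equation 13 can be solved by accumulating mass outward from the center. The sketch below is a simplified illustration under stated assumptions: the center is taken as given (the iterative re-centering mentioned above is omitted), the sphere is grown until the mean enclosed density first drops below the threshold, and the function name is hypothetical.

```python
import math

def so_radius(radii, masses, delta, rho_ref):
    """Grow a sphere around a fixed center until the mean enclosed density
    first falls below delta * rho_ref.

    radii  : particle distances from the adopted center (physical units)
    masses : particle masses, in the same order as radii
    Returns (r_so, m_so) for the last particle still above the threshold,
    or None if even the innermost particle is below it.
    """
    threshold = delta * rho_ref
    order = sorted(range(len(radii)), key=lambda i: radii[i])
    enclosed = 0.0
    result = None
    for i in order:
        enclosed += masses[i]
        r = radii[i]
        if r <= 0.0:
            continue  # a particle exactly at the center does not define a sphere
        mean_density = enclosed / (4.0 * math.pi / 3.0 * r ** 3)
        if mean_density < threshold:
            break
        result = (r, enclosed)
    return result

# Toy profile with M(<r) proportional to r: unit-mass particles every 0.01 in radius.
radii = [0.01 * i for i in range(1, 1001)]
masses = [1.0] * len(radii)
# Reference density chosen so that the contrast of 200 is reached at r = 5.
rho_ref = 3.0 / (200.0 * math.pi)
r_so, m_so = so_radius(radii, masses, delta=200.0, rho_ref=rho_ref)
print(r_so, m_so)
```

Stopping at the first crossing is one possible convention; noisy or strongly substructured profiles may require a more careful treatment.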

The choice of $\Delta$ and the reference $\rho$ may be motivated by theoretical considerations or by
observational limitations. For example, one can choose to define the enclosed overdensity
to be equal to the "virial" overdensity at collapse predicted by the spherical collapse model,
$\Delta\rho = \Delta_{\rm vir,c}\rho_{\rm crit}$ (see Section 3.2). Note that in $\Omega_{\rm m} \neq 1$ cosmologies, there is a choice for
reference density to be either the critical density $\rho_{\rm cr}(z)$ or the mean matter density $\rho_{\rm m}(z)$, and
both are in common use. The overdensities defined with respect to these reference densities,
which we denote here as $\Delta_{\rm c}$ and $\Delta_{\rm m}$, are related as $\Delta_{\rm m} = \Delta_{\rm c}/\Omega_{\rm m}(z)$. Note that
$\Omega_{\rm m}(z) = \Omega_{\rm m0}(1 + z)^3/E^2(z)$, where $E(z)$ is given by Equation 4. For concordance cosmology,
$1 - \Omega_{\rm m}(z) < 0.1$ at $z > 2$ and the difference between the two definitions decreases at these
high redshifts. In observations, the choice may simply be determined by the extent of the
measured mass profile. Thus, masses derived from X-ray data under the assumption of
HE are limited by the extent of the measured gas density and temperature profiles and are
therefore often defined for the high values of overdensity: $\Delta_{\rm c} = 2500$ or $\Delta_{\rm c} = 500$.
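
The conversion between the two conventions is easy to evaluate numerically. The sketch below assumes a flat $\Lambda$CDM model without radiation, so that $E^2(z) = \Omega_{\rm m0}(1+z)^3 + 1 - \Omega_{\rm m0}$, and an illustrative value $\Omega_{\rm m0} = 0.27$; both are assumptions made here, as are the function names.

```python
def omega_m(z, omega_m0=0.27):
    """Matter density parameter at redshift z for flat LCDM without radiation (assumed)."""
    e2 = omega_m0 * (1.0 + z) ** 3 + (1.0 - omega_m0)  # E^2(z) under the stated assumption
    return omega_m0 * (1.0 + z) ** 3 / e2

def delta_mean_from_crit(delta_c, z, omega_m0=0.27):
    """Contrast with respect to the mean matter density equivalent to delta_c (critical)."""
    return delta_c / omega_m(z, omega_m0)

for z in (0.0, 1.0, 2.0):
    print(f"z = {z:.0f}: Omega_m = {omega_m(z):.3f}, "
          f"Delta_c = 200 corresponds to Delta_m = {delta_mean_from_crit(200.0, z):.0f}")
```

With these parameters the two contrasts differ by almost a factor of four at $z = 0$ but by less than ten per cent at $z = 2$, in line with the statement above.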

The crucial difference from the FoF algorithm is the fact that the SO definition forces a
spherical boundary on the generally non-spherical mass distribution. In addition, spheres
corresponding to different halos may overlap, which means that a certain fraction of mass
may be double counted (although in practice this fraction is very small, see, e.g., discussion
in § 2.2 of Tinker et al. 2008).



The advantage of the SO algorithm is the fact that the SO-defined mass can be measured 
both in simulations and observational analyses of clusters. In the latter the SO mass can be 
estimated from the total mass profile derived from the hydrostatic and Jeans equilibrium 
analysis for the ICM gas and galaxies, respectively (see Section 3.4 above), or gravitational
lensing analyses (e.g., Hoekstra 2007, Vikhlinin et al. 2006). Furthermore, suitable
observables that correlate with the SO mass with scatter of $\lesssim 10\%$ can be defined (see § 4 below),
thus making this mass definition preferable in the cosmological interpretation of observed
cluster populations. The small scatter shows that the effects of triaxiality are quite small in
practice. Note, however, that the definition of the halo center in simulations and observa- 
tions may not necessarily be identical, because in observations the cluster center is usually 
defined at the position of the peak or the centroid of X-ray emission or SZ signal, or at the 
position of the BCG. 



3.7 Abundance of halos 



Contrasting predictions for the abundance and clustering of collapsed objects with the
observed abundance and clustering of galaxies, groups, and clusters has been among the most
powerful validation tests of structure formation models (e.g., Blumenthal et al. 1984, Kaiser
1984, 1986, Press & Schechter 1974).



Although real clusters are usually characterized by some quantity derived from
observations (an observable), such as the X-ray luminosity, such quantities are generally harder
to predict ab initio in theoretical models because they are sensitive to uncertain physical
processes affecting the properties of cluster galaxies and intracluster gas. Therefore, the
predictions for the abundance of collapsed objects are usually quantified as a function of
their mass, i.e., in terms of the mass function $dn(M, z)$ defined as the comoving volume
number density of halos in the mass interval $[M, M + dM]$ at a given redshift $z$. The
predicted mass function is then connected to the abundance of clusters as a function of an
observable using a calibrated mass-observable relation (discussed in § 4 below). Below we 
review theoretical models for halo abundance and underlying reasons for its approximate 
universality. 



3.7.1 The mass function and its universality. 

The first statistical model for the abundance of collapsed objects as a function of their mass
was developed by Press & Schechter (1974). The main powerful principle underlying this
model is that the mass function of objects resulting from nonlinear collapse can be tied
directly and uniquely to the statistical properties of the initial linear density contrast field
$\delta(\mathbf{x})$.

Statistically, one can define the probability $F(M)$ that a given region within the initial
overdensity field smoothed on a mass scale $M$, $\delta_M(\mathbf{x})$, will collapse into a halo of mass $M$ or
larger:

$$F(M) = \int p(\delta)\,C_{\rm coll}(\delta)\,d\delta, \qquad (14)$$

where $p(\delta)d\delta$ is the PDF of $\delta_M(\mathbf{x})$, which is given by Equation 2 for the Gaussian initial
density field, and $C_{\rm coll}$ is the probability that any given point $\mathbf{x}$ with local overdensity $\delta_M(\mathbf{x})$
will actually collapse. The mass function can then be derived as a fraction of the total
volume collapsing into halos of mass $(M, M + dM)$, i.e., $dF/dM$, divided by the comoving
volume within the initial density field occupied by each such halo, i.e., $M/\bar{\rho}_{\rm m}$:

$$\frac{dn(M)}{dM} = \frac{\bar{\rho}_{\rm m}}{M}\left|\frac{dF}{dM}\right|. \qquad (15)$$



In their pioneering model, Press & Schechter (1974) adopted the ansatz motivated by
the spherical collapse model (see § 3.1) that any point in space with $\delta_M(\mathbf{x})D_{+0}(z) > \delta_c$ will
collapse into a halo of mass $> M$ by redshift $z$: i.e., $C_{\rm coll}(\delta) = \Theta(\delta - \delta_c)$, where $\Theta$ is the
Heaviside step function. Note that $\delta_M(\mathbf{x})$ used above is not the actual initial overdensity, but
the initial overdensity evolved to $z = 0$ with the linear growth rate. One can easily check that
for a Gaussian initial density field this assumption gives
$F(M) = \frac{1}{2}{\rm erfc}[\delta_c/(\sqrt{2}\sigma(M, z))] \equiv F(\nu)$.
This line of arguments and assumptions thus leads to an important conclusion that the
abundance of halos of mass $M$ at redshift $z$ is a universal function of only their peak height
$\nu(M, z) \equiv \delta_c/\sigma(M, z)$. In particular, the fraction of mass in halos per logarithmic interval of
mass in such a model is:

$$\frac{dn(M)}{d\ln M} = \frac{\bar{\rho}_{\rm m}}{M}\left|\frac{dF}{d\ln M}\right|
= \frac{\bar{\rho}_{\rm m}}{M}\left|\frac{d\ln\nu}{d\ln M}\frac{dF}{d\ln\nu}\right|
\equiv \frac{\bar{\rho}_{\rm m}}{M}\left|\frac{d\ln\nu}{d\ln M}\right| g(\nu)
\equiv \frac{\bar{\rho}_{\rm m}}{M}\,\psi(\nu). \qquad (16)$$
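
The Gaussian, sharp-threshold result can be checked numerically. The sketch below evaluates $F(\nu)$ and the corresponding $|dF/d\ln\nu|$; the value $\delta_c = 1.686$ is the standard Einstein-de Sitter spherical collapse threshold and is used here as an assumption, as are the function names.

```python
import math

def peak_height(sigma, delta_c=1.686):
    """nu = delta_c / sigma; delta_c = 1.686 is an assumed (Einstein-de Sitter) value."""
    return delta_c / sigma

def collapsed_fraction(nu):
    """F(nu) = 0.5 * erfc(nu / sqrt(2)) for a Gaussian field with a sharp threshold."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))

def g_ps(nu):
    """|dF / dln(nu)| obtained by differentiating the expression above."""
    return nu * math.exp(-0.5 * nu * nu) / math.sqrt(2.0 * math.pi)

for sigma in (2.0, 1.0, 0.5):
    nu = peak_height(sigma)
    print(f"sigma = {sigma:.1f}: nu = {nu:.2f}, F = {collapsed_fraction(nu):.3e}, g = {g_ps(nu):.3e}")
```

The exponential cutoff at $\nu > 1$ is what makes the abundance of the most massive halos so sensitive to the amplitude of fluctuations.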



Clearly, the shape $\psi(\nu)$ in such models is set by the assumptions of the collapse model.
Numerical studies based on cosmological simulations have eventually revealed that the
shape $\psi_{\rm PS}(\nu)$ predicted by the Press & Schechter (1974) model deviates by > 50% from the
actual shape measured in cosmological simulations (e.g., Gross et al. 1998, Jenkins et al.
2001, Klypin et al. 1995, Lee & Shandarin 1999, Sheth & Tormen 1999, Tormen 1998).

Fig. 7. The function $\psi(\nu)$ defining the comoving abundance of collapsed halos via
$dn/d\ln M = (\bar{\rho}_{\rm m}/M)\psi(\nu)$ as a function of $\nu$ from different models and simulation-based
calibrations. The upper panel shows deviations of specific models and calibrations for $z = 0$ from Tinker
et al. (2010) based on a suite of $\Lambda$CDM cosmological simulations.

A number of modifications to the original ansatz have been proposed, which result in $\psi(\nu)$
that more accurately describes simulation results. Such modifications are based on the
collapse conditions that take into account asphericity of the peaks in the initial density field
(Audit, Teyssier & Alimi 1997; Desjacques 2008; Lee & Shandarin 1998; Monaco 1995;
Sheth & Tormen 2002) and stochasticity due to the dependence of the collapse condition
on peak properties other than $\nu$ or shape (e.g., Corasaniti & Achitouv 2011; de Simone,
Maggiore & Riotto 2011; Desjacques 2008; Ma et al. 2011; Maggiore & Riotto 2010). The
more sophisticated excursion set models match the simulations more closely, albeit at the
expense of more assumptions and parameters. There may be also inherent limitations in
the accuracy of such models given that they rely on the strong assumption that one can
parameterize all the factors influencing collapse of any given point in the initial overdensity
field in a relatively compact form. In the face of complications to a simple picture of peak
collapse, as discussed in Section 3.3, one can indeed expect that the excursion set ansatze
are limited in how accurately they can ultimately describe the halo mass function.

Fig. 8. The square of the bias, $b^2(\nu)$, as a function of peak height $\nu$ corresponding to halos of mass
$M_{200{\rm m}}$ in the bias model based on the excursion set and spherical collapse barrier (dashed line; Mo & White
1996), in the excursion set model based on a model of ellipsoidal collapse (dot-dashed line; Sheth, Mo
& Tormen 2001), and the bias function calibrated using $\Lambda$CDM cosmological simulations (solid
line; Tinker et al. 2010) for SO halos defined using overdensity of $\Delta = 200$ with respect to the mean
density. The upper panel shows deviations of excursion set models from the calibration of Tinker
et al. (2010).



3.7.2 Calibrations of halo mass function in cosmological simulations. 

An alternative route to derive predictions for halo abundance accurately is to calibrate it
using large cosmological simulations of structure formation. Simulations have generally
confirmed the remarkable fact that the abundance of halos can be parameterized via a
universal function of peak height $\nu$ (Bhattacharya et al. 2011, Courtin et al. 2011, Crocce et al.
2010, Evrard et al. 2002, Jenkins et al. 2001, Lukic et al. 2007, Reed et al. 2007, Sheth
& Tormen 1999, Tinker et al. 2008, Warren et al. 2006, White 2002). Note that in many
studies the linear overdensity for collapse is assumed to be constant across redshifts and
cosmologies and the mass function is therefore quantified as a function of $\sigma^{-1}$ - the
quantity proportional to $\nu$. However, as pointed out by Courtin et al. (2011) it is necessary to
include the redshift and cosmology dependence of $\delta_c(z)$ for an accurate description of the
mass function across cosmologies. Even though $\delta_c$ varies only by $\sim 1 - 2\%$, it enters into
halo abundance via an exponent and such small variations can result in variations in the
mass function of several per cent or more.
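
The size of this effect can be gauged with a toy calculation. The sketch below assumes the Press-Schechter shape $\psi \propto \nu\exp(-\nu^2/2)$ purely for illustration (calibrated forms differ in detail) and shifts $\delta_c$ at fixed $\sigma$; the function names are hypothetical.

```python
import math

def psi_shape(nu):
    """Press-Schechter-like shape, nu * exp(-nu^2 / 2), up to a constant factor (assumed form)."""
    return nu * math.exp(-0.5 * nu * nu)

def fractional_change(nu, d_delta_c):
    """Relative change of psi when delta_c, and hence nu at fixed sigma, shifts by d_delta_c."""
    return psi_shape(nu * (1.0 + d_delta_c)) / psi_shape(nu) - 1.0

for nu in (1.0, 2.0, 3.0, 4.0):
    change = 100.0 * fractional_change(nu, 0.01)
    print(f"nu = {nu:.0f}: a 1% increase in delta_c changes psi by {change:+.1f}%")
```

For this assumed shape the logarithmic sensitivity is $d\ln\psi/d\ln\delta_c = 1 - \nu^2$, so a 1% shift in $\delta_c$ changes the abundance of $\nu \approx 3$ peaks by roughly 8%.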

The main efforts with simulations have thus been aimed at improving the accuracy of the
$\psi(\nu)$ functional form, assessing systematic uncertainties related to the mass definition, and
quantifying deviations from the universality of $\psi(\nu)$ for different redshifts and cosmologies.
The mass function, and especially its exponential tail, is sensitive to the specifics of halo
mass definition, a point emphasized strongly in a number of studies (Cohn & White 2008;
Jenkins et al. 2001; Klypin, Trujillo-Gomez & Primack 2011; Tinker et al. 2008; White
2002). Thus, in precision cosmological analyses using an observed cluster abundance, care
must be taken to ensure that the cluster mass definition matches that used in the calibration 
of the halo mass function. 

Predictions for the halo abundance as a function of the SO mass for a variety of
overdensities used to define the SO boundaries, accurate to better than $\sim 5 - 10\%$ over the redshift
interval $z = [0, 2]$, were presented by Tinker et al. (2008). In Figure 7 we compare the form
of the function $\psi(\nu)$ calibrated through simulations by different researchers to $\psi(\nu)$
predicted by the Press-Schechter model and to the calibration of the functional
form based on the ellipsoidal collapse ansatz by Sheth, Mo & Tormen (2001).



These calibrations of the mass function through N-body simulations provide the basis for
the use of galaxy clusters as tools to constrain cosmological models through the growth rate
of perturbations (see the recent reviews by Allen, Evrard & Mantz 2011 and Weinberg et al.
2012). As we discuss in Section 5 below, similar calibrations can be extended to models
with non-Gaussian initial density field and models of modified gravity. 

Future cluster surveys promise to provide tight constraints on cosmological parameters,
thanks to the large statistics of clusters with accurately inferred masses. Realizing the potential of
such surveys clearly requires a precise calibration of the mass function, which currently
represents a challenge. Deviations from universality at the level of up to $\sim 10\%$ have been
reported (Cohn & White 2008, Courtin et al. 2011, Crocce et al. 2010, Lukic et al. 2007,
Reed et al. 2007, Tinker et al. 2008). In principle, a precise calibration of the mass function
is a challenging but tractable technical problem, as long as it only requires a large suite
of dissipationless simulations for a given set of cosmological parameters, and an optimal
interpolation procedure (e.g., Lawrence et al. 2010).



A more serious challenge is the modelling of uncertain effects of baryon physics: baryon
collapse, dissipation, and dynamical evolution, as well as feedback effects related to
energy release by the SNe and AGN, may lead to subtle redistribution of mass in halos. Such
redistribution can affect halo mass and thereby halo mass function at the level of a few
per cent (Cui et al. 2011; Rudd, Zentner & Kravtsov 2008; Stanek, Rudd & Evrard 2009),
although the exact magnitude of the effect is not yet certain due to uncertainties in our un- 
derstanding of the physics of galaxy formation in general, and the process of condensation 
and dynamical evolution in clusters in particular. 



3.8 Clustering of halos 



Galaxy clusters are clustered much more strongly than galaxies themselves. It is this strong
clustering discovered in the early 1980s (Bahcall & Soneira 1983, Klypin & Kopylov 1983)
that led to the development of the concept of bias in the context of Gaussian initial density
perturbation field (Kaiser 1984). Linear bias of halos is the coefficient between the
overdensity of halos within a given sufficiently large region and the overdensity of matter in that
region: $\delta_{\rm h} = b\delta$, with $b$ defined as the bias parameter. For the Gaussian perturbation fields,
local linear bias is independent of scale (Scherrer & Weinberg 1998), such that the
power spectrum and correlation function on large scales can be expressed in terms of
the corresponding quantities for the underlying matter distribution as $P_{\rm hh}(k) = b^2P_{\rm mm}(k)$
and $\xi_{\rm hh}(r) = b^2\xi_{\rm mm}(r)$, respectively. As we discuss in Section 5, this is not true for
non-Gaussian initial perturbation fields (Dalal et al. 2008) or for models with scale-dependent
linear growth rate (Parfrey, Hui & Sheth 2011), in which cases the linear bias is generally
scale-dependent.

In the context of the hierarchical structure formation, halo bias is closely related to the
overall abundance of halos discussed above, as illustrated by the "peak-background split"
framework (Cole & Kaiser 1989, Kaiser 1984, Mo & White 1996, Sheth & Tormen 1999),
in which the linear halo bias is obtained by considering a Lagrangian patch of volume $V_0$,
mass $M_0$, and overdensity $\delta_0$ at some early redshift $z_0$. The bias is calculated by requiring
that the abundance of collapsed density peaks within such a patch is described by the same
function $\psi(\nu_p)$ as the mean abundance of halos in the Universe, but with peak height $\nu_p$
appropriately rescaled with respect to the overdensity of the patch and relative to the rms
fluctuations on the scale of the patch. Thus, the functional form of the bias dependence
on halo mass, $b_{\rm h}(M)$, depends on the functional form of the mass function explicitly in
this framework. Simulations show that the peak-background split model provides a fairly
accurate (to $\sim 20\%$) prediction for the linear halo bias (Tinker et al. 2010).
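
For the Press-Schechter shape the peak-background split argument can be carried through in a few lines. The sketch below assumes that the patch is much larger than the halo scale, so that the rms fluctuation on the patch scale is negligible compared with $\sigma(M)$, and that $\delta_0$ is linearly extrapolated to the epoch at which $\sigma$ and $\delta_c$ are evaluated; it reproduces the well-known result of Cole & Kaiser (1989) and Mo & White (1996) rather than the calibrated bias.

```latex
\begin{align}
\psi_{\rm PS}(\nu) &\propto \nu\, e^{-\nu^2/2}, \qquad
\nu_p = \frac{\delta_c - \delta_0}{\sigma(M)} = \nu\left(1 - \frac{\delta_0}{\delta_c}\right), \\
\delta_{\rm h}^{\rm L} &\equiv \frac{\psi_{\rm PS}(\nu_p)}{\psi_{\rm PS}(\nu)} - 1
\simeq -\frac{d\ln\psi_{\rm PS}}{d\ln\nu}\,\frac{\delta_0}{\delta_c}
= \frac{\nu^2 - 1}{\delta_c}\,\delta_0, \\
b_{\rm h} &= 1 + \frac{\nu^2 - 1}{\delta_c}.
\end{align}
```

The last step adds unity to convert the Lagrangian overdensity of halos into the Eulerian one, because the patch itself is compressed or expanded by the factor $1 + \delta_0$.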



Another line of argument illustrating the close connection between the halo abundance and
bias is the fact that if one assumes that all of the mass is in the collapsed halos, as is done
for example in the halo models (Cooray & Sheth 2002), the requirement that matter in the
Universe is not biased against itself implies that $\int b(\nu)g(\nu)\,d\ln\nu = 1$ (e.g., Seljak 2000),
where $g(\nu) = |d\ln\nu/d\ln M|^{-1}\psi(\nu)$ (see eq. 16). This integral constraint requires that the
form of the bias function $b(\nu)$ is changed whenever $\psi(\nu)$ changes. Incidentally, the close
connection between $b(\nu)$ and $\psi(\nu)$ implies that if $\psi(\nu)$ is a universal function, then the bias
$b(\nu)$ should be a universal function as well.
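
The integral constraint can be verified numerically for a simple analytic pair of functions. The sketch below assumes the Press-Schechter mass fraction normalized so that all mass is in halos (i.e., including the conventional factor of two), the corresponding peak-background split bias $b = 1 + (\nu^2 - 1)/\delta_c$, and $\delta_c = 1.686$; these choices and the function names are illustrative assumptions.

```python
import math

DELTA_C = 1.686  # assumed linear collapse threshold

def g_ps(nu):
    """Mass fraction per ln(nu), normalized so that all mass is in halos (assumed form)."""
    return math.sqrt(2.0 / math.pi) * nu * math.exp(-0.5 * nu * nu)

def b_ps(nu):
    """Peak-background split bias for the form above (Cole & Kaiser 1989; Mo & White 1996)."""
    return 1.0 + (nu * nu - 1.0) / DELTA_C

def integrate_dlnnu(f, ln_min=-12.0, ln_max=3.0, n=20000):
    """Trapezoidal integral of f(nu) over ln(nu) between the given limits."""
    h = (ln_max - ln_min) / n
    total = 0.5 * (f(math.exp(ln_min)) + f(math.exp(ln_max)))
    for i in range(1, n):
        total += f(math.exp(ln_min + i * h))
    return total * h

mass_fraction = integrate_dlnnu(g_ps)
bias_integral = integrate_dlnnu(lambda nu: b_ps(nu) * g_ps(nu))
print(mass_fraction, bias_integral)
```

Both integrals come out equal to unity to within the accuracy of the quadrature, which illustrates why a change in the shape of the mass function has to be accompanied by a change in the bias function.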

The function $b(\nu)$ recently calibrated for the SO-defined halos of different overdensities
using a suite of large cosmological simulations with accuracy < 5% and satisfying the
integral constraint (Tinker et al. 2010) is shown in Figure 8 for halos defined using an
overdensity of $\Delta = 200$ with respect to the mean. This calibration of the bias is compared to
the corresponding prediction of the Press & Schechter (1974) and the Sheth, Mo & Tormen
(2001) ansatze. The figure shows that $b(\nu)$ is a rather weak function of $\nu$ at $\nu < 1$, but
steepens substantially for rare peaks of $\nu > 1$. It also shows that the rarest clusters ($\nu \sim 5$)
in the Universe can have the amplitude of the correlation function or power spectrum that
is two orders of magnitude larger than the clustering amplitude of the galaxy-sized halos
($\nu < 1$).



3.9 Self-similar evolution of galaxy clusters 

In the previous sections we have considered processes that govern the collapse of mat- 
ter during cluster formation, the transition to equilibrium and the equilibrium structure of 
matter distribution in collapsed halos. In the following sections, we consider baryonic
processes that shape the observable properties of clusters, such as their X-ray luminosity or the
temperature of the ICM. However, before we delve into the complexities of such physical 
processes, it is instructive to introduce the simplest models based on assumptions of self- 
similarity, in which the number of control parameters is minimal. We discuss the assump- 
tions and predictions of the self-similar model in some detail because parametric scalings 
that it predicts are in wide use to interpret results from both cosmological simulations of 
cluster formation and observations. 



3.9.1 Self-similar model: assumptions and basic expectations.

The self-similar model developed by Kaiser (1986) makes three key assumptions. The first
assumption is that clusters form via gravitational collapse from peaks in the initial density
field in an Einstein-de Sitter Universe, $\Omega_{\rm m} = 1$. Gravitational collapse in such a
Universe is scale-free, or self-similar. The second assumption is that the amplitude of density
fluctuations is a power-law function of their size, i.e., the power spectrum of the initial
perturbations is $P(k) \propto k^n$. This means that initial
perturbations also do not have a preferred scale (i.e., they are scale-free or self-similar). The
third assumption is that the physical processes that shape the properties of forming
clusters do not introduce new scales in the problem. With these assumptions the problem has
only two control parameters: the normalization of the power spectrum of the initial density
perturbations at an initial time, $t_i$, and its slope, $n$. Properties of the density field and halo
population at $t > t_i$ (or corresponding redshift $z < z_i$), such as typical halo masses that
collapse or halo abundance as a function of mass, depend only on these two parameters. One
can choose any suitable variable that depends on these two parameters as a characteristic
variable for a given problem. For evolution of halos and their abundance, the commonly
used choice is to define the characteristic nonlinear mass, $M_{\rm NL}$ (see Section 3.1), which
encapsulates such dependence. The halo properties and halo abundance then become
universal functions of $\mu \equiv M/M_{\rm NL}$ in such a model. Thus, for example, clusters with masses
$M_1(z_1)$ and $M_2(z_2)$ that correspond to the same ratio $M_1(z_1)/M_{\rm NL}(z_1) = M_2(z_2)/M_{\rm NL}(z_2)$
have the same dimensionless properties, such as gas fraction or concentration of their mass
distribution.
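
In the scale-free case the evolution of the characteristic mass has a closed form, which makes the mapping between self-similar counterparts explicit. The sketch below assumes that the nonlinear mass is defined by $\sigma(M_{\rm NL}, z) = \delta_c$, that $\sigma \propto M^{-(n+3)/6}$ for a power spectrum with slope $n$, and that linear growth in an Einstein-de Sitter universe scales as $1/(1+z)$; the function names are hypothetical.

```python
def m_nl_ratio(z, n):
    """M_NL(z) / M_NL(0) for an Einstein-de Sitter universe with power-law slope n > -3."""
    if n <= -3:
        raise ValueError("slope n must be larger than -3")
    # sigma(M, z) ~ M^(-(n + 3) / 6) / (1 + z); setting sigma = delta_c gives this scaling.
    return (1.0 + z) ** (-6.0 / (n + 3.0))

def self_similar_counterpart(m1, z1, z2, n):
    """Mass at z2 that has the same M / M_NL as a halo of mass m1 at z1."""
    return m1 * m_nl_ratio(z2, n) / m_nl_ratio(z1, n)

for n in (-2.0, -1.0, 0.0):
    print(f"n = {n:+.0f}: M_NL(z = 1) / M_NL(0) = {m_nl_ratio(1.0, n):.4f}")
```

For example, with $n = -1$ a halo at $z = 1$ is the self-similar counterpart of a halo eight times more massive at $z = 0$.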

As we have discussed above, in more general cosmologies the halo properties and mass
function should be universal functions of the peak height $\nu$, which encapsulates the
dependence on the shape and normalization of the power spectra for general, non power-law
shapes of the fluctuation spectrum.



3.9.2 The Kaiser model for cluster scaling relations. 

In this Section, we define cluster mass to be the mass within the sphere of radius $R$
encompassing a characteristic density contrast, $\Delta$, with respect to some reference density $\rho_r$
(usually either $\rho_{\rm cr}$ or $\rho_{\rm m}$): $M = (4\pi/3)\Delta\rho_r R^3$. In this definition, radius and mass are directly
related and interchangeable. The model assumes spherical symmetry and that the ICM is in
equilibrium within the cluster gravitational potential, so that the HE equation (eq. 9) holds.
The mass $M(<R)$ derived from the HE equation is proportional to $T(R)R$ and the sum of
the logarithmic slopes of the gas density and temperature profiles at $R$. In addition to the
assumptions about self-similarity discussed above, a key assumption made in the model by
Kaiser (1986) is that these slopes are independent of $M$, so that

$$T \propto \frac{M}{R} \propto (\Delta\rho_r)^{1/3}M^{2/3}. \qquad (17)$$

Note that formally the quantity $T$ appearing in the above equation is the temperature
measured at $R$, whereas some average temperature at smaller radii is usually measured in
observations. However, if we parameterize the temperature profile as $T(r) = T_*\tilde{T}(x)$, where
$T_*$ is the characteristic temperature and $\tilde{T}$ is the dimensionless profile as a function of
dimensionless radius $x \equiv r/R$, and we assume that $\tilde{T}(x)$ is independent of $M$, any
temperature averaged over the same fraction of radial range $[x_1, x_2]$ will scale as $\propto T_* \propto T(R)$.
The latter is not strictly true for the "spectroscopic" temperature, $T_X$, derived by fitting an
observed X-ray spectrum to a single-temperature bremsstrahlung model (Mazzotta et al.
2004, Vikhlinin 2006), although in practice deviations of $T_X$ from the expected behavior
are small.

The gas mass within $R$ can be computed by integrating over the gas density profile, which,
by analogy with temperature, we parameterize as $\rho_g(r) = \rho_{g*}\tilde{\rho}_g(x)$, where $\rho_{g*}$ is the
characteristic density and $\tilde{\rho}_g$ is the dimensionless profile. The gas mass within $R$ can then be
expressed as

$$M_g(<R) = 4\pi\rho_{g*}R^3\int_0^1 x^2\tilde{\rho}_g(x)\,dx = 3M\,\frac{\rho_{g*}}{\Delta\rho_r}\,I_{\rho_g} \propto M(<R), \qquad (18)$$

where $I_{\rho_g} \equiv \int_0^1 x^2\tilde{\rho}_g(x)\,dx$. The latter proportionality is assumed in the Kaiser (1986) model, which means that $\rho_{g*}$ and
$I_{\rho_g}$ are assumed to be independent of $M$. Note that $\rho_{g*} \propto \Delta\rho_r$, so $\Delta\rho_r$ does not enter the
$M_g-M$ relation.

Using the scalings of $M_g$ and $T$ with mass, we can construct other cluster properties of
interest, such as the luminosity of ICM emitted due to its radiative cooling. Assuming that
the ICM emission is due to the free-free radiation and neglecting the weak logarithmic
dependence of the Gaunt factor on temperature, the bolometric luminosity can be written
as (e.g., Sarazin 1986):

$$L_{\rm bol} \propto \rho_g^2T^{1/2}V \propto \frac{M_g^2}{V}T^{1/2} \propto \Delta^{7/6}M^{4/3}. \qquad (19)$$

We omit $\rho_r$ in these equations for clarity; it suffices to remember that $\rho_r$ enters into the
scaling relations exactly as $\Delta$. Note that the bolometric luminosity of a cluster is not observable
directly, and the X-ray luminosity in soft band (e.g., 0.5 - 2 keV), $L_{\rm Xs}$, is frequently used.
Such soft band X-ray luminosity is almost insensitive to temperature at $T > 2$ keV
(Fabricant & Gorenstein 1983, as can be easily verified with a plasma emission code), so that its
temperature dependence can be neglected. $L_{\rm Xs}$ then scales as:

$$L_{\rm Xs} \propto \rho_g^2V \propto \Delta M. \qquad (20)$$

At temperatures T <2 keV temperature dependence is more complicated both for the bolo- 
metric and soft X-ray emissivity due to significant flux in emission lines. Therefore, strictly 
speaking, for lower mass systems the above L- M scaling relations are not applicable, and 
scaling of the emissivity with temperature needs to be calibrated separately taking also into 
account the ICM metallicity. The same is true for luminosity defined in some other energy 
band or for the bolometric luminosity.

Another quantity of interest is the ICM "entropy" defined in X-ray analyses as

$$K \equiv \frac{k_B T}{n_e^{2/3}} \propto \rho_g^{-2/3}\,T \propto \Delta^{-1/3} M^{2/3}, \qquad (21)$$

where $n_e$ is the electron number density. Finally, the quantity $Y = M_g T$, where gas mass and
temperature are measured within a certain range of radii scaled to $R_\Delta$, is used to characterize
the ICM in the analyses of SZ and X-ray observations. This quantity is expected to be a
particularly robust proxy of the cluster mass (e.g., da Silva et al. 2004; Fabjan et al. 2011;
Kravtsov, Vikhlinin & Nagai 2006; Motl et al. 2005; Nagai 2006, see also discussion in
Section 4) because it is proportional to the global thermal energy of the ICM. Using Equations
18 and 17, the scaling of Y with mass in the self-similar model is

$$Y \equiv M_g T \propto \Delta^{1/3} M^{5/3}. \qquad (22)$$
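
The exponents in Equations 19-22 follow from simple bookkeeping of the powers of $\Delta$ and M. The sketch below illustrates that bookkeeping; it is not part of the original derivation, the function and variable names are ours, and the only inputs are the scalings $T \propto \Delta^{1/3}M^{2/3}$, $M_g \propto M$, and $V \propto M/\Delta$ assumed in the text.

```python
from fractions import Fraction as F

# Each quantity is stored as (power of Delta, power of M).
# Multiplying two quantities adds exponents; raising to a power scales them.
def mul(a, b):
    return (a[0] + b[0], a[1] + b[1])

def power(a, p):
    return (a[0] * p, a[1] * p)

# Inputs of the Kaiser model (Equations 17 and 18 plus the SO mass definition).
TEMPERATURE = (F(1, 3), F(2, 3))          # T     ~ Delta^(1/3) M^(2/3)
M_GAS = (F(0), F(1))                      # M_g   ~ M
VOLUME = (F(-1), F(1))                    # V     ~ R^3 ~ M / Delta
RHO_GAS = mul(M_GAS, power(VOLUME, -1))   # rho_g ~ M_g / V ~ Delta

def kaiser_exponents():
    """Return {name: (Delta exponent, M exponent)} for Equations 19-22."""
    emission = mul(power(RHO_GAS, 2), VOLUME)              # rho_g^2 V
    return {
        "L_bol": mul(emission, power(TEMPERATURE, F(1, 2))),   # free-free emission
        "L_xs": emission,                                       # T dependence neglected
        "K": mul(TEMPERATURE, power(RHO_GAS, F(-2, 3))),        # T n_e^(-2/3)
        "Y": mul(M_GAS, TEMPERATURE),                           # M_g T
    }

for name, (d_exp, m_exp) in kaiser_exponents().items():
    print(f"{name:5s} ~ Delta^({d_exp}) M^({m_exp})")
```

Exact fractions are used so that the result can be compared directly with the exponents quoted in the equations.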



Note that the redshift dependence in the normalization of the scaling relations introduced
above is due solely to the particular SO definition of mass and the associated redshift
dependence of $\Delta\rho_r$. In $\Omega_m \neq 1$ cosmologies, there is a choice of either defining
the mass relative to the mean matter density or critical density (Section 3.6.2). This specific,
arbitrary choice determines the specific redshift dependence of the observable-mass relations. It
is clear that this evolution due to $\Delta(z)$ factors has no deep physical meaning. However, the
absence of any additional redshift dependence in the normalization of the scaling relations is just
the consequence of the assumptions of the Kaiser (1986) model and is a physical reflection of
these assumptions.

Extra evolution can, therefore, be expected if one or more of the assumptions of the self- 
similar model is violated. This can be due to either actual physical processes that break 
self-similarity or the fact that some of the model assumptions are not accurate. We dis- 
cuss physical processes that lead to the breaking of self-similarity in subsequent sections. 
Here below we first consider possible deviations that may arise because assumptions of the 
Kaiser model do not hold exactly, i.e. deviations not ascribed to physical processes that 
explicitly violate self-similarity. 

3.9.3 Extensions of the Kaiser model.

Going back to Equations 17 and 18, we note that the specific scalings $T \propto M^{2/3}$ and
$M_g \propto M$ will hold only if the dimensionless temperature and gas density profiles,
$\tilde T(x)$ and $\tilde\rho_g(x)$, are independent of M. In practice, however, some mass
dependence of these profiles is expected. For example, if the concentration of the gas
distribution depends on mass similarly to the concentration of the DM profile (Ascasibar et al.
2006), the weak mass dependence of DM concentration implies weak mass dependence of
$\tilde\rho_g(x)$ and $\tilde T(x)$. Indeed, concentration depends on mass even in purely
self-similar models (Cole & Lacey 1996; Navarro, Frenk & White 1997). These dependencies imply
that predictions of the Kaiser model may not describe accurately even the purely self-similar
evolution. This is evidenced by deviations of scaling relation evolution from these predictions
in hydrodynamical simulations of cluster formation even in the absence of any physical processes
that can break self-similarity (e.g., Nagai 2006, Stanek et al. 2010).

In addition, the characteristic gas density, $\rho_{g*}$, may be mildly modified by a mass-dependent,
non self-similar process during some early stage of evolution. If such a process does not
introduce a pronounced mass scale and is confined to some early epochs (e.g., owing to
shutting off of star formation in cluster galaxies due to AGN feedback and gas accretion
suppression), then subsequent evolution may still be described well by the self-similar
model. The Kaiser model is just the simplest specific example of a more general class of
self-similar models, and can therefore be extended to take into account the deviations described
above.

For simplicity, let us assume that the scalings of gas mass and gas mass fraction against
total mass can be expressed as a power law of mass:

$$M_g = C_g M^{1+\alpha_g}, \qquad f_g \equiv \frac{M_g}{M} = C_g M^{\alpha_g}. \qquad (23)$$

This does not violate the self-similarity of the problem per se, as long as dimensionless
properties of an object, such as $f_g$, remain a function of only the dimensionless mass
$\mu(z) \equiv M/M_{\rm NL}(z)$. This means that the normalization of the $M_g$-M relation must scale
as $C_g \propto M_{\rm NL}^{-\alpha_g}$ during the self-similar stages of evolution, such that

$$M_g = C_{g0}\,M_{\rm NL}^{-\alpha_g}(z)\,M^{1+\alpha_g} = C_{g0}\,M\,\mu^{\alpha_g}. \qquad (24)$$

Note that self-similarity requires that the slope $\alpha_g$ does not evolve with redshift.

By analogy with the $M_g$-M relation, we can assume that the T-M relation can be well
described by a power law of the form

$$T = C_T\,\Delta^{1/3} M^{2/3+\alpha_T}, \qquad (25)$$

where $\alpha_T$ describes mild deviation from the scaling due, e.g., to mild mass dependence of
the gas and temperature profile slopes in the HE equation. The dimensionless quantities involving
temperature T can be constructed using ratios of T with
$T_{\rm NL} \equiv (\mu m_p/k_B)\,GM_{\rm NL}/R_{\rm NL} \propto \Delta^{1/3}M_{\rm NL}^{2/3}$, where
$\mu m_p$ in the prefactor is the mean mass per particle. As before for the gas fraction, the
requirement that such a dimensionless ratio depends only on the dimensionless mass $\mu$ requires
$C_T \propto M_{\rm NL}^{-\alpha_T}$, so that

$$T = C_{T0}\,M_{\rm NL}^{-\alpha_T}\,\Delta^{1/3} M^{2/3+\alpha_T} = C_{T0}\,\Delta^{1/3} M^{2/3}\mu^{\alpha_T}. \qquad (26)$$

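Other observable-mass scaling relations can be constructed in the manner similar to the
derivation of the original relations above. These are summarized below for the specific
choice of $\Delta\rho_r = \Delta_c\rho_{\rm cr}(z) \propto h^2E^2(z)$:

$$M_g \propto M_{\rm NL}^{-\alpha_g}(z)\,M^{1+\alpha_g}, \qquad (27)$$
$$T \propto E(z)^{2/3}\,M_{\rm NL}^{-\alpha_T}(z)\,M^{2/3+\alpha_T}, \qquad (28)$$
$$L_{\rm bol} \propto E(z)^{7/3}\,M_{\rm NL}^{-2\alpha_g-\alpha_T/2}(z)\,M^{4/3+2\alpha_g+\alpha_T/2}, \qquad (29)$$
$$K \propto E^{-2/3}(z)\,M_{\rm NL}^{\frac{2}{3}\alpha_g-\alpha_T}(z)\,M^{2/3+\alpha_T-\frac{2}{3}\alpha_g}, \qquad (30)$$
$$Y \propto E(z)^{2/3}\,M_{\rm NL}^{-\alpha_g-\alpha_T}(z)\,M^{5/3+\alpha_g+\alpha_T}. \qquad (31)$$

In all of the relations one can, of course, recover the original relations by setting
$\alpha_g = \alpha_T = 0$. The observable-mass relations can be used to predict
observable-observable relations by eliminating mass from the corresponding relations above.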
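
As a cross-check of Equations 27-31, the exponents of $E(z)$, $M_{\rm NL}(z)$, and M can be generated from Equations 27 and 28 alone, using $V \propto M/(\Delta\rho_r) \propto M/E^2$. The following sketch is illustrative only: the function name and the tuple layout are ours, and the values of $\alpha_g$ and $\alpha_T$ passed to it are free inputs rather than measured numbers.

```python
from fractions import Fraction as F

def _mul(a, b):
    return tuple(x + y for x, y in zip(a, b))

def _pow(a, p):
    return tuple(x * p for x in a)

def extended_exponents(alpha_g=0, alpha_t=0):
    """Return {name: (E(z) exponent, M_NL exponent, M exponent)}.

    Assumes Delta * rho_r scales as E(z)^2, as in the text. alpha_g and
    alpha_t may be ints, Fractions, or strings such as "1/10".
    """
    ag, at = F(alpha_g), F(alpha_t)
    m_gas = (F(0), -ag, 1 + ag)               # Equation 27
    temp = (F(2, 3), -at, F(2, 3) + at)       # Equation 28
    volume = (F(-2), F(0), F(1))              # V ~ M / (Delta rho_r)
    rho_gas = _mul(m_gas, _pow(volume, -1))   # rho_g ~ M_g / V
    return {
        "M_g": m_gas,
        "T": temp,
        "L_bol": _mul(_mul(_pow(rho_gas, 2), volume), _pow(temp, F(1, 2))),
        "K": _mul(temp, _pow(rho_gas, F(-2, 3))),
        "Y": _mul(m_gas, temp),
    }

# Setting both deviations to zero recovers the original Kaiser exponents.
print(extended_exponents(0, 0)["Y"])
print(extended_exponents("1/10", "1/20")["L_bol"])
```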






Note that the evolution of scaling relations in this extended model arises both from the
redshift dependence of $\Delta(z)\rho_r(z)$ and from the extra redshift dependence due to factors
involving $M_{\rm NL}$. The practical implication is that if measurements show that
$\alpha_g \neq 0$ and/or $\alpha_T \neq 0$ at some redshift, the original Kaiser scaling relations
are not expected to describe the evolution, even if the evolution is self-similar. Instead,
relations given by Equations 27-31 should be used. Note that at $z \approx 0$, observations indicate
that within the radius $r_{500}$ enclosing overdensity $\Delta_c = 500$, $\alpha_g \approx 0.1-0.2$,
while $\alpha_T \approx 0-0.1$. Therefore, the extra evolution compared to the Kaiser model
predictions due to factors involving $\alpha_g$ and, to a lesser degree, factors involving
$\alpha_T$ is expected. Such evolution, consistent with predictions of the above equations, is
indeed observed both in simulations (see, e.g., Fig. 10 in Vikhlinin et al. 2009b) and in
observations (Lin et al. 2012, although see Böhringer, Dolag & Chon 2012).



In practice, evolution of the scaling relations can be quite a bit more complicated than the 
evolution predicted by the above equations. The complication is not due to any deviation 
from self-similarity but rather due to specific mass definition and the fact that cluster for- 
mation is an extended process that is not characterized by a single collapse epoch. Some 
clusters evolve only mildly after their last major merger. However, the mass of such clusters
will change with z even if their potential does not change, simply because the mass definition is
tied to a reference density that changes with expansion of the Universe and because density
profiles of clusters extend smoothly well beyond the virial radius. Any observable property
of clusters that has a radial profile differing from the mass profile, but which is measured
within the same $R_\Delta$, will change differently than mass with redshift. As a simplistic toy
model, consider a population of clusters that does not evolve from z = 1 to z = 0. Their
X-ray luminosity is mostly due to the ICM in the central regions of clusters and it is not
sensitive to the outer boundary of integration as long as it is sufficiently large. Thus, X-ray
luminosity of such a non-evolving population will not change with z, but masses $M_\Delta$ of
clusters will increase with decreasing z as the reference density used to define the cluster
boundary decreases. Normalization of the $L_X - M_\Delta$ relation will thus decrease with
decreasing redshift simply due to the definition of mass. The strength of the evolution will
be determined by the slope of the mass profile around $R_\Delta$, which is weakly dependent on
mass. Such an effect may, thus, result in the evolution of both the slope and normalization
of the relation. In this respect, quantities that have radial profiles most similar to the total
mass profile (e.g., $M_g$, Y) will suffer the least from such spurious evolution.
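
The toy model above can be made concrete with a short numerical sketch. We assume, purely for illustration, a static NFW density profile fixed in physical units, a flat $\Lambda$CDM cosmology with $\Omega_m = 0.3$ and $h = 0.7$, and a mass defined by an overdensity of 500 relative to the critical density; none of these specific choices or parameter values is taken from the text.

```python
import math

OMEGA_M = 0.3                        # assumed matter density (flat LCDM)
LITTLE_H = 0.7                       # assumed dimensionless Hubble constant
RHO_CRIT0 = 2.775e11 * LITTLE_H**2   # critical density at z = 0 [Msun / Mpc^3]


def rho_crit(z):
    """Critical density at redshift z for the assumed cosmology."""
    return RHO_CRIT0 * (OMEGA_M * (1.0 + z) ** 3 + 1.0 - OMEGA_M)


def nfw_mass(r, rho_s, r_s):
    """Mass of an NFW halo enclosed within physical radius r."""
    x = r / r_s
    return 4.0 * math.pi * rho_s * r_s**3 * (math.log(1.0 + x) - x / (1.0 + x))


def so_radius_and_mass(z, rho_s, r_s, delta=500.0):
    """Solve mean enclosed density = delta * rho_crit(z) for a fixed profile."""
    target = delta * rho_crit(z)
    lo, hi = 1e-3 * r_s, 1e3 * r_s       # bracket; mean density falls with r
    for _ in range(200):
        mid = math.sqrt(lo * hi)         # bisection in log radius
        mean_density = nfw_mass(mid, rho_s, r_s) / (4.0 / 3.0 * math.pi * mid**3)
        if mean_density > target:
            lo = mid
        else:
            hi = mid
    radius = math.sqrt(lo * hi)
    return radius, nfw_mass(radius, rho_s, r_s)


# Hypothetical static cluster: rho_s in Msun/Mpc^3, r_s in Mpc (physical).
RHO_S, R_S = 1.0e15, 0.4
for z in (1.0, 0.0):
    r500, m500 = so_radius_and_mass(z, RHO_S, R_S)
    print(f"z = {z:.0f}: r500 = {r500:.2f} Mpc, M500 = {m500:.2e} Msun")
```

Although the density profile is identical at both redshifts, the spherical-overdensity mass is larger at z = 0 because the reference density is lower, so a fixed luminosity is assigned to a larger mass.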

Finally, we note again that in principle for general non power-law initial perturbation
spectra of the CDM models the scaling with $M/M_{\rm NL}$ needs to be replaced with scaling with
the peak height $\nu$. For clusters within a limited mass range, however, the power spectrum
can be approximated by a power law and thus a characteristic mass similar to $M_{\rm NL}$ can be
constructed, although such mass should be within the typical mass range of the clusters.
The latter is not true for $M_{\rm NL}$, which is considerably smaller than typical cluster mass
at all z.



3.9.4 Practical implications for observational calibrations of scaling relations. 

In observational calibrations of the cluster scaling relations, it is often necessary to rescale 
between different redshifts either to bring results from the different z to a common redshift, 
or because the scaling relation is evaluated using clusters from a wide range of redshifts 
due to small sample size. It is customary to use predictions of the Kaiser model to carry 
out such rescaling to take into account the redshift dependence of $\Delta(z)$. In this context, one
should keep in mind that these predictions are approximate due to the approximate nature
of some of the assumptions of the model, as discussed above. Inaccuracies introduced by
such scalings may, for example, then be incorrectly interpreted as intrinsic scatter about the
scaling relation.

In addition, because the $\Delta(z)$ factors are a result of an arbitrary mass definition, they
should not be interpreted as physically meaningful factors describing evolution of mass. For
example, in the T-M relation, the $\Delta^{1/3}$ factor in Equation 17 arises due to the dimensional
M/R factor of the HE equation. As such, this factor does not change even if the power-law
index of the T-M relation deviates from 2/3, in which case the relation has the form
$T \propto \Delta^{1/3}M^{2/3+\alpha_T}$. In other words, if one fits for the parameters of this
relation, such as normalization A and slope B, using measurements of temperatures and masses for a
sample of clusters spanning a range of redshifts, the proper parameterization of the fit should be

$$\frac{T}{T_p} = A\,\Delta^{1/3}\left(\frac{M}{M_p}\right)^{B}, \qquad (32)$$

where $T_p$ and $M_p$ are appropriately chosen pivots. The parameterization
$T/T_p = A(\Delta^{1/2}M/M_p)^B$ that is sometimes adopted in observational analyses is not correct
in the context of the self-similar model. In other words, only the observable quantities should be
rescaled by the $\Delta\rho_r$ factors, and not the mass. Likewise, only the $\Delta\rho_r$ factors
actually predicted by the Kaiser model should be present in the scalings. For example, no such
factor is predicted for the $M_g$-M relation and therefore the gas and total masses of clusters at
different redshifts should not be scaled by $\Delta\rho_r$ factors in the fits of this relation.
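
The difference between the two parameterizations can be illustrated with a noise-free mock sample. The sketch below is only an illustration under stated assumptions: it takes $\Delta\rho_r \propto E^2(z)$ so that the redshift factor becomes $E(z)^{2/3}$, uses a flat $\Lambda$CDM $E(z)$ with $\Omega_m = 0.3$, and adopts hypothetical masses, redshifts, pivots, and a true slope of 0.58. The function names are ours.

```python
import math


def e_of_z(z, omega_m=0.3):
    """E(z) = H(z)/H0 for an assumed flat LCDM cosmology."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_tm(masses, temps, redshifts, m_pivot, t_pivot, rescale_mass=False):
    """Fit the T-M relation in log space; returns (A, B).

    rescale_mass=False: T/Tp = A * E(z)^(2/3) * (M/Mp)^B  (observable rescaled)
    rescale_mass=True : T/Tp = A * (E(z) * M/Mp)^B        (mass rescaled)
    """
    xs, ys = [], []
    for m, t, z in zip(masses, temps, redshifts):
        ln_e = math.log(e_of_z(z))
        if rescale_mass:
            xs.append(math.log(m / m_pivot) + ln_e)
            ys.append(math.log(t / t_pivot))
        else:
            xs.append(math.log(m / m_pivot))
            ys.append(math.log(t / t_pivot) - 2.0 / 3.0 * ln_e)
    ln_a, slope = fit_line(xs, ys)
    return math.exp(ln_a), slope


# Noise-free mock sample (hypothetical values) with a true slope B = 0.58.
B_TRUE, M_PIVOT, T_PIVOT = 0.58, 3.0e14, 5.0
MASSES = [1.0e14, 1.5e14, 3.0e14, 5.0e14, 8.0e14, 1.2e15]
REDSHIFTS = [0.05, 0.1, 0.2, 0.4, 0.6, 0.9]
TEMPS = [T_PIVOT * e_of_z(z) ** (2.0 / 3.0) * (m / M_PIVOT) ** B_TRUE
         for m, z in zip(MASSES, REDSHIFTS)]

print(fit_tm(MASSES, TEMPS, REDSHIFTS, M_PIVOT, T_PIVOT))
print(fit_tm(MASSES, TEMPS, REDSHIFTS, M_PIVOT, T_PIVOT, rescale_mass=True))
```

With the observable rescaled, the input slope is recovered; with the mass rescaled, the fitted slope is biased whenever the true slope differs from 2/3 and mass correlates with redshift in the sample, as it does in this mock.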

Finally, we note that observational calibrations of the observable-mass scaling relations
generally depend on the distances to clusters and are therefore cosmology dependent. Such
dependence arises because distances are used to convert observed angular scale to physical
scale within which an "observable" is defined, $R = \theta d_A(z) \propto \theta h^{-1}$, or to
convert observed flux f to luminosity, $L = 4\pi f d_L^2(z)$, where $d_A(z)$ and
$d_L(z) = d_A(z)(1+z)^2$ are the angular diameter distance and luminosity distance, respectively.
Thus, if the total mass M of a cluster is measured using the HE equation, we have
$M_{\rm HE} \propto TR \propto d_A \propto h^{-1}$. The same scaling is expected for the mass
derived from the weak lensing shear profile measurements.

If the gas mass is measured from the X-ray flux from a volume $V \propto R^3 \propto \theta^3 d_A^3$,
which scales as
$f = L_X/(4\pi d_L^2) \propto \rho_g^2 V/d_L^2 \propto M_g^2/(V d_L^2) \propto M_g^2/(\theta^3 d_A^3 d_L^2)$
and where f and $\theta$ are observables, gas mass then scales with distance as
$M_g \propto d_L d_A^{3/2} \propto h^{-5/2}$. This dependence can be exploited to constrain
cosmological parameters, as in the case of X-ray measurements of gas fractions in clusters
(Allen et al. 2008, 2004; Ettori et al. 2009; Ettori, Tozzi & Rosati 2003; LaRoque et al. 2006)
or abundance evolution of clusters as a function of their observable (e.g., Vikhlinin et al.
2009c). In this respect, the $M_g$-M relation has the strongest scaling with distance and
cosmology, whereas the scaling of the T-M relation is the weakest (e.g., see discussion by
Vikhlinin et al. 2009b).
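
The distance scalings above translate directly into how inferred masses change when the assumed cosmology changes at fixed observables ($\theta$, f, T). The helper below is a small illustrative sketch; its name and interface are ours, and the gas fraction scaling it returns, $f_g = M_g/M_{\rm HE} \propto d_L d_A^{1/2}$, is simply the ratio of the two scalings quoted in the text.

```python
def distance_rescaling(d_a_ratio, d_l_ratio):
    """Factors by which inferred quantities change at fixed observables.

    d_a_ratio and d_l_ratio are the new-to-old ratios of the angular
    diameter and luminosity distances to the cluster.
    """
    m_he = d_a_ratio                       # M_HE ~ T R ~ d_A
    m_gas = d_l_ratio * d_a_ratio ** 1.5   # M_g ~ d_L d_A^(3/2)
    return {"M_HE": m_he, "M_gas": m_gas, "f_gas": m_gas / m_he}


def hubble_rescaling(h_old, h_new):
    """Special case: only h changes, so both distances scale as 1/h."""
    ratio = h_old / h_new
    return distance_rescaling(ratio, ratio)


# Example: changing the assumed h from 0.7 to 0.5 at fixed observables.
print(hubble_rescaling(0.7, 0.5))
```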



3.10 Cluster formation and Thermodynamics of the Intra-cluster gas 



Gravity that drives the collapse of the initial large-scale density peaks affects not only the 
properties of the cluster DM halos, but also the thermodynamic properties of the intra- 
cluster plasma. The latter are also affected by processes related to galaxy formation, such 
as cooling and feedback. Below, we discuss the thermodynamic properties of the ICM
resulting from gravitational heating, radiative cooling, and stellar and AGN feedback during
cluster formation. 



3.10.1 Gravitational collapse of the intra-cluster gas.

The diffuse gas infalling onto the DM-dominated potential wells of clusters converts the
kinetic energy acquired during the collapse into thermal energy via adiabatic compression
and shocks. As gas settles into HE, its temperature approaches values close to the virial
temperature corresponding to the cluster mass. In the spherically symmetric collapse model of
Bertschinger (1985), supersonic accretion gives rise to the expanding shock at the interface
of the inner hydrostatic gas with a cooler, adiabatically compressed, external medium.
Real three-dimensional collapse of clusters is more complicated and exhibits large deviations
from spherical symmetry, as accretion proceeds both in a quasi-spherical fashion from
low-density regions and along relatively narrow filaments. The gas accreting along the latter
penetrates much deeper into the cluster virial region and does not undergo a shock at
the virial radius (see Fig. 9). The strong shocks are driven not just by the accretion of gas
from the outside but also "inside-out" during major mergers (e.g., Poole et al. 2007).



The shocks arising during cluster formation can be classified into two broad categories:
strong external shocks surrounding filaments and the virialized regions of DM halos and
weaker internal shocks, located within the cluster virial radius (e.g., Pfrommer et al. 2006,
Skillman et al. 2008, Vazza et al. 2009). The strong shocks arise in the high-Mach number
flows of the intergalactic gas, whereas weak shocks arise in the relatively low-Mach number
flows of gas in filaments and accreting groups, which was pre-heated at earlier epochs by
the strong shocks surrounding filaments or external groups. The left panel of Figure 9 shows
these two types of shocks in a map of the shocked cells identified in a cosmological adaptive
mesh refinement simulation of a region surrounding a galaxy cluster (from Vazza et al.
2009), along with the gas velocity field. This map highlights the strong external shocks,
characterized by high Mach numbers $\mathcal{M} > 30$, surrounding the cluster at several virial
radii from the cluster center, and weaker internal shocks, with $\mathcal{M} \sim 2-3$. The cluster
is shown at the epoch immediately following a major merger, which generated substantial velocities
of gas within the virial radius. As we discuss in § 4 below, incomplete thermalization of these
gas motions is one of the main sources of non-thermal pressure support in the ICM.
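
To give a sense of the difference between the two classes of shocks, the standard Rankine-Hugoniot jump conditions for an ideal gas can be evaluated at the Mach numbers quoted above. This is a textbook sketch rather than anything specific to the simulations discussed here; it assumes a plane-parallel, adiabatic shock in a monatomic gas with $\gamma = 5/3$, and the function name is ours.

```python
GAMMA = 5.0 / 3.0   # adiabatic index of a monatomic ideal gas


def shock_jumps(mach, gamma=GAMMA):
    """Post-shock to pre-shock ratios of density, pressure, temperature, entropy.

    The entropy ratio uses K ~ P / rho^gamma, which for gamma = 5/3 is
    proportional to T n^(-2/3) as in Equation 21.
    """
    if mach < 1.0:
        raise ValueError("a shock requires Mach number >= 1")
    m2 = mach * mach
    density = (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)
    pressure = (2.0 * gamma * m2 - (gamma - 1.0)) / (gamma + 1.0)
    temperature = pressure / density
    entropy = pressure / density**gamma
    return density, pressure, temperature, entropy


for mach in (2.0, 3.0, 30.0):
    d, p, t, k = shock_jumps(mach)
    print(f"M = {mach:4.0f}: rho x{d:.2f}, P x{p:.1f}, T x{t:.1f}, K x{k:.1f}")
```

The density jump saturates near 4 for strong shocks while the temperature and entropy jumps keep growing roughly as the square of the Mach number, which is why the high-Mach external shocks raise the entropy of each unit of gas far more than the weak internal ones.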

The right panel of Figure 9 shows the distribution of the kinetic energy processed by shocks,
as a function of the local shock Mach number, for different redshifts (Skillman et al. 2008).
The figure shows that a large fraction of the kinetic energy is processed by weak internal
shocks and this fraction increases with decreasing redshift as more and more of the accreting
gas is pre-heated in filaments. Yet, the left panel of Fig. 9 highlights that
large-$\mathcal{M}$ shocks surround virialized halos in such a way that most gas particles
accreted in a galaxy cluster must have experienced at least one strong shock in their past.

Because gravity does not have a characteristic length scale, we expect the predictions of
the self-similar model, presented in Section 3.9, to apply when gravitational gas accretion
determines the thermal properties of the ICM. The scaling relations and their evolution
predicted by the self-similar model are indeed broadly confirmed by the non-radiative
hydrodynamical simulations that include only gravitational heating (e.g., Eke, Navarro & Frenk
1998; Nagai, Kravtsov & Vikhlinin 2007; Navarro, Frenk & White 1995), although some
small deviations arising due to small differences in the dynamics of baryons and DM were
also found (Ascasibar et al. 2006, Nagai 2006, Stanek et al. 2010).

Fig. 9. Left panel: map of the shocked cells identified by the divergence of velocity colored by the
local shock Mach number and turbulent gas velocity field (streamlines) in a slice of the simulation
box 7.5 Mpc on a side and depth of 18 kpc at z = 0.6, for a simulated cluster that reaches a mass of
$\sim 2\times10^{14}M_\odot$ by z = 0 (from Vazza et al. 2009). Right panel: redshift evolution of
the distribution of the kinetic energy processed by shocks, as a function of the Mach number in a
cosmological simulation (from Skillman et al. 2008). Results shown in both panels are based on the
adaptive mesh refinement ENZO code (O'Shea et al. 2004).



As discussed in Section 2, observations carried out with the Chandra and XMM-Newton 
telescopes during the past decade showed that the outer regions of clusters ($r > r_{2500}$)
exhibit self-similar scaling, whereas the core regions exhibit strong deviations from self- 
similarity. In particular, gas density in the core regions of small-mass clusters is lower than 
expected from self-similar scaling of large-mass systems. These results indicate that some 
additional non-gravitational processes are shaping properties of the ICM. We review some 
of these processes studied in cluster formation models below. 



3.10.2 Phenomenological pre-heating models. 



The first proposed mechanism to break self-similarity was high-redshift ($z \gtrsim 3$) pre-heating
by non-gravitational sources of energy, presumably by a combined action of the AGN and
stellar feedback (Evrard & Henry 1991, Kaiser 1991). The specific extra heating energy
per unit mass, $E_h$, defines the temperature scale $T_* \propto E_h/k_B$, such that clusters with
virial temperature $T_{\rm vir} > T_*$ should be left almost unaffected by the extra heating,
whereas in smaller clusters with $T_{\rm vir} < T_*$ gas accretion is suppressed. As a result, gas
density is relatively lower in less massive systems, especially at smaller radii, while their
entropy is higher.

Both analytical models (e.g., Babul et al. 2002, Tozzi & Norman 2001, Voit et al. 2003)
and hydrodynamical simulations (e.g., Bialek, Evrard & Mohr 2001; Borgani et al. 2002;
Muanwong et al. 2002) have demonstrated that with a suitable pre-heating prescription and
typical heating injection of $E_h \sim 0.5-1$ keV per gas particle self-similarity can be
broken to the degree required to reproduce observed scaling relations. Studies of the possible
feedback mechanisms show that such amounts of energy cannot be provided by SNe (e.g.,
Borgani et al. 2004, Henning et al. 2009, Kay et al. 200%, Kravtsov & Yepes 2000, Renzini
2000), and must be injected by the AGN population (e.g., Bower, McCarthy & Benson
2008; Lapi, Cavaliere & Menci 2005; Wu, Fabian & Nulsen 2000) or by some other
unknown source.



However, regardless of the actual sources of heating, strong widespread heating at high
redshifts would conflict with the observed statistical properties of the Lyman-$\alpha$ forest
(Borgani & Viel 2009; Shang, Crotts & Haiman 2007). Moreover, hydrodynamical simulations have
demonstrated that simple pre-heating models predict large isentropic cores (e.g., Borgani
et al. 2005, Younger & Bryan 2007) and shallow pressure profiles (Kay et al. 2012). This
is at odds with the entropy and pressure profiles of real clusters, which exhibit smoothly
declining entropy down to r ~ 10-20 kpc (e.g., Arnaud et al. 2010, Cavagnolo et al.
2009).



3.10.3 The role of radiative cooling. 

The presence of galaxies in clusters and low levels of the ICM entropy in cluster cores are a
testament that radiative cooling has operated during cluster formation in the past and is an
important process shaping thermodynamics of the core gas at present. Therefore, in general
radiative cooling cannot be neglected in realistic models of cluster formation. Given that
cooling generally introduces new scales, it can break self-similarity of the ICM even in
the absence of heating (Voit & Bryan 2001). In particular, cooling removes low-entropy
gas from the hot ICM phase in the cluster cores, which is replaced by higher entropy gas
from larger radii. Somewhat paradoxically, the cooling thus leads to an entropy increase of
the hot, X-ray emitting ICM phase. This effect is illustrated in Figure 10, which shows the
entropy maps in the simulations of the same cluster with and without cooling. In the absence
of cooling (left panel), the innermost region of the cluster is filled by low-entropy gas.
Merging substructures also carry low-entropy gas, which generates comet-like features by
ram-pressure stripping, and is hardly mixed into the hotter ambient gas of the main halo. In the
simulation with radiative cooling (right panel), most of the low-entropy gas associated with
substructures and the central cluster region is absent, and most of the ICM has a relatively
high entropy.

A more quantitative analysis of the entropy distribution for these simulated clusters is
shown in Figure 11, in which the entropy profiles of clusters simulated with inclusion of
different physical processes are compared with the baseline analytic spherical accretion
model; this model predicts the power-law entropy profile $K(r) \propto r^{1.1}$ (e.g., Tozzi &
Norman 2001, Voit 2005). The figure shows that the entropy profile in the simulation with radiative
cooling is significantly higher than that of the non-radiative simulation. The difference in
entropy is as large as an order of magnitude in the inner regions of the cluster and is greater
by a factor of two even at $r_{500}$.
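
For orientation, the entropy of Equation 21 and the baseline power law can be evaluated with a few lines of code. This is a minimal sketch: the function names are ours, the input temperature and density are arbitrary illustrative values, and the normalization of the baseline profile at $r_{500}$ is a free, hypothetical parameter rather than a value taken from the simulations or from the analytic model.

```python
def entropy_kev_cm2(t_kev, n_e_cm3):
    """K = k_B T / n_e^(2/3), with k_B T in keV and n_e in cm^-3."""
    return t_kev / n_e_cm3 ** (2.0 / 3.0)


def baseline_entropy(r_over_r500, k_500, slope=1.1):
    """Power-law profile K(r) = K_500 (r / r_500)^slope."""
    return k_500 * r_over_r500 ** slope


# Illustrative gas state: k_B T = 5 keV at n_e = 1e-3 cm^-3.
k_gas = entropy_kev_cm2(5.0, 1.0e-3)

# Ratio to a baseline profile with a hypothetical K_500 = 1000 keV cm^2.
k_base = baseline_entropy(0.5, 1000.0)
print(f"K = {k_gas:.0f} keV cm^2, ratio to baseline at 0.5 r500 = {k_gas / k_base:.2f}")
```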

Interestingly, the predicted level of entropy at $r \sim r_{2500} - r_{500}$ in the simulations with
cooling (but no significant heating) is consistent with the ICM entropy inferred from X-ray
observations. However, this agreement is likely to be spurious because it is achieved
with the amount of cooling that results in conversion of $\approx 40\%$ of the baryon mass in
clusters into stars and cold gas, which is inconsistent with observational measurements of the
cold fraction, which varies from $\approx 20-30\%$ for small-mass, X-ray-emitting clusters to
$\lesssim 10\%$ for massive clusters (see § 2).

Fig. 10. Maps of entropy in cosmological hydrodynamical simulations of a galaxy cluster of mass
$M_{500} \simeq 10^{15}h^{-1}M_\odot$ at z = 0, carried out without (left panel) and with (right
panel) radiative cooling. Brighter colors correspond to lower gas entropy. Each panel encompasses
a physical scale of $6.5\,h^{-1}$ Mpc, which corresponds to $\approx 2.5$ virial radii for this
cluster. The simulations have been carried out using the GADGET-3 smoothed particle hydrodynamics
code (Springel 2005).

Finally, note that inclusion of cooling in simulations with pre-heating discussed above
usually results in problematic star-formation histories. In fact, if pre-heating takes place at a
sufficiently high redshift, clusters exhibit excessive cooling at lower redshifts, as pre-heated
gas collapses and cools at later epochs compared to the simulations without pre-heating
(e.g., Tornatore et al. 2003). These results highlight the necessity to treat cooling and
heating processes simultaneously using heating prescriptions that can realistically reproduce
the heating rate of the ICM gas as a function of cosmic time. We discuss efforts in this
direction next.



3.10.4 Thermodynamics of the intracluster medium with stellar and active galactic 
nuclei feedback. 

The results discussed above strongly indicate that, in order to reproduce the overall
properties of clusters, cooling should be modelled together with a realistic prescription for
non-gravitational heating. This is particularly apparent in the cluster cores, where steady
heating is required to offset the ongoing radiative cooling observed in the form of strong X-ray
emission (see, e.g., Peterson & Fabian 2006). The study of feedback processes in clusters
is one of the frontiers in cluster formation modelling. Although we do not yet have a
complete picture of the ICM heating, a number of interesting and promising results have been
obtained.

Fig. 11. Radial profiles of entropy (in units of keV cm$^2$) for the same
simulations whose entropy maps are shown in Figure 10. Magenta dotted, long-dashed blue,
continuous red, and short-dashed green curves refer to the non-radiative simulation and to the
three radiative simulations including only cooling and star formation, including also the effect
of galactic ejecta from supernovae, and including also the effect of AGN feedback, respectively.
The dot-dashed line shows the power-law entropy profile $K(r) \propto r^{1.1}$, whereas the
vertical dotted line marks the position of $r_{500}$.

In Figure 11, the solid line shows the effect of the SN feedback on the entropy profile. In
these simulations, the kinetic feedback of SNe is included in the form of galactic winds
carrying kinetic energy comparable to all of the energy released by Type-II SNe expected
to occur according to star formation in the simulation. This energy partially compensates
for the radiative losses in the central regions, which leads to a lower level of entropy in
the core. However, the core ICM entropy in these simulations is still considerably higher
than observed (e.g., Sun et al. 2009). The inefficiency of the SN feedback in offsetting the
cooling sufficiently is also evidenced by temperature profiles.
Figure 4 (from Leccardi & Molendi 2008) compares the observed temperature profiles of
a sample of local clusters with results from simulations that include the SN feedback. The
figure shows that simulations reproduce the observed temperature profile at
$r \gtrsim 0.2r_{180}$. The overall shape of the profile at these large radii is reproduced by
simulations including a wide range of physical processes, including non-radiative simulations
(e.g., Borgani et al. 2004; Loken et al. 2002; Nagai, Kravtsov & Vikhlinin 2007). At smaller
radii, however, the observed and predicted profiles do not match: the profiles in simulated
clusters continue to increase to the smallest resolved radii, whereas the observed profiles reach
a maximum temperature $T_{\rm max}$ and then decrease with decreasing radius to temperatures of
$\sim 0.1-0.5\,T_{\rm max}$. The high temperature of the central gas in simulations reflects its
high entropy and is due to the processes affecting the entropy, as discussed above.

Another indication that the SN feedback alone is insufficient is the fact that the stellar mass
of the BCGs in simulations that include only the feedback from SNe is a factor of two to
three larger than the observed stellar masses. For example, in the simulated clusters shown
in Figure 10, the baryon fraction in stars within $r_{500}$ decreases from $\approx 40\%$ in
simulations without SN feedback to $\approx 30\%$, which is still a factor of two larger than
observational measurements. The overestimate of the stellar mass is reflected in the overestimate
of the ICM metallicity in cluster cores (e.g., Borgani et al. 2008, and references therein).



Different lines of evidence indicate that energy input from the AGN in the central cluster
galaxies can provide most of the energy required to offset cooling (see McNamara & Nulsen
2007, for a comprehensive review). Because the spatial and temporal scales resolved in
cosmological simulations are larger than those relevant for gas accretion and energy input, the
AGN energy feedback can only be included via a phenomenological prescription. Such
prescriptions generally model the feedback energy input rate by assuming the Bondi gas
accretion rate onto the SMBHs, included as sink particles, and incorporate a number of
phenomenological parameters, such as the radiative efficiency and the feedback efficiency,
the latter quantifying the fraction of the radiated energy that thermally couples to the
surrounding gas (e.g., Springel, Di Matteo & Hernquist 2005). The values of these parameters are
adjusted so that simulations reproduce the observed relation between black hole mass and
the velocity dispersion of the host stellar bulge (e.g., Marconi & Hunt 2003). An alternative
way of implementing the AGN energy injection is via AGN-driven winds, which shock and
heat the surrounding gas (e.g., Dubois et al. 2011, Gaspari et al. 2011, Omma et al. 2004).
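
The structure of such a phenomenological prescription can be sketched as follows. This is an illustration of the general idea only, not the implementation of any particular code: the Eddington cap, the optional boost factor, and the default efficiencies (0.1 and 0.05) are assumptions introduced here for concreteness, and the function names are ours. Quantities are in cgs units.

```python
import math

G = 6.674e-8          # gravitational constant [cm^3 g^-1 s^-2]
C = 2.998e10          # speed of light [cm/s]
M_P = 1.6726e-24      # proton mass [g]
SIGMA_T = 6.652e-25   # Thomson cross section [cm^2]
M_SUN = 1.989e33      # solar mass [g]


def bondi_rate(m_bh, rho, c_s, v_rel=0.0, boost=1.0):
    """Bondi accretion rate [g/s] for gas density rho and sound speed c_s."""
    return 4.0 * math.pi * boost * G**2 * m_bh**2 * rho / (c_s**2 + v_rel**2) ** 1.5


def eddington_rate(m_bh, eps_r=0.1):
    """Accretion rate [g/s] at which the luminosity equals the Eddington value."""
    return 4.0 * math.pi * G * m_bh * M_P / (eps_r * SIGMA_T * C)


def feedback_power(m_bh, rho, c_s, v_rel=0.0, eps_r=0.1, eps_f=0.05, boost=1.0):
    """Thermal energy injection rate [erg/s]: eps_f * eps_r * Mdot * c^2."""
    mdot = min(bondi_rate(m_bh, rho, c_s, v_rel, boost), eddington_rate(m_bh, eps_r))
    return eps_f * eps_r * mdot * C**2


# Hypothetical example: a 1e9 Msun black hole in gas with
# rho = 1e-25 g/cm^3 and a sound speed of 500 km/s.
m_bh = 1.0e9 * M_SUN
print(f"{feedback_power(m_bh, 1.0e-25, 5.0e7):.2e} erg/s")
```

The steep $M_{\rm BH}^2$ dependence of the Bondi rate is what makes the feedback self-regulating in such schemes, while the two efficiencies are the parameters that are tuned against the observed black hole scaling relations.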



In general, simulations of galaxy clusters based on different variants of these models have
shown that the AGN feedback can reduce star formation in massive cluster galaxies and
reduce the hot gas content in the poor clusters and groups, thereby improving agreement with
the observed relation between X-ray luminosity and temperature (e.g., Puchwein, Sijacki
& Springel 2008; Sijacki et al. 2007). Figure 12 (from Martizzi, Teyssier & Moore 2012)
shows that simulations with the AGN feedback result in stellar masses of the BCGs that
agree with the masses required to match observed stellar masses of galaxies and masses of
their DM halos predicted by the models. The figure also shows that stellar masses are
overpredicted in the simulations without the AGN feedback. Incidentally, the large-scale winds
at high redshifts and stirring of the ICM in cluster cores by the AGN feedback also help to
bring the metallicity profiles in cluster simulations into better agreement with observations
(Fabjan et al. 2010, McCarthy et al. 201^).



Although results of simulations with the AGN feedback are promising, simulations so far
have not been able to convincingly reproduce the observed thermal structure of cool cores.
As an example, Figure 11 shows that the entropy profiles in such simulations still develop
a large constant-entropy core inconsistent with observed profiles. Interestingly, the adaptive
mesh refinement simulations with jet-driven AGN feedback by Dubois et al. (2011) reproduce
the monotonically decreasing entropy profiles inferred from observations. However,
such agreement only exists if radiative cooling does not account for the metallicity of the
ICM; in simulations that take into account the metallicity dependence of the cooling rates
the entropy profile still develops a large constant-entropy core.

Fig. 12. Comparison of the relation between stellar mass and total halo mass as predicted by
cosmological hydrodynamical simulations of four early-type galaxies (symbols) (from Martizzi,
Teyssier & Moore 2012). The open triangle and square refer to the simulations presented by Naab,
Johansson & Ostriker (2009) and by Feldmann et al. (2010), both based on smoothed particle
hydrodynamics codes and not including AGN feedback. The filled symbols refer to the simulations by
Martizzi, Teyssier & Moore (2012) with the brightest cluster galaxies forming at the center of a
relatively poor cluster carried out with an AMR code, both including (triangle) and excluding
(pentagon) AGN feedback. The red dotted line represents the relation expected for 20% efficiency
in the conversion of baryons into stars. The solid black line is the prediction from Moster et al.
(2010) of a model in which dark matter halos are populated with stars in such a way as to
reproduce the observed stellar mass function. The grey shaded areas represent the 1-, 2- and
3-$\sigma$ scatter around the average relation.

The presence of a population of relativistic particles in AGN-driven high-entropy bubbles
has been suggested as a possible solution to this problem (Guo & Oh 2008, Sijacki et al.
2008). A relativistic plasma increases the pressure support available at a fixed temperature
and can, therefore, help to reproduce the observed temperature and entropy profiles in
core regions. However, it remains to be seen whether the required population of cosmic
rays is consistent with available constraints inferred from γ-ray and radio observations of
clusters (e.g., Brunetti 2011, and references therein). A number of additional processes,
such as thermal conduction (e.g., Narayan & Medvedev 2001) or dynamical friction heating
by galaxies (El-Zant, Kim & Kamionkowski 2004), have been proposed. Generally,
these processes cannot effectively regulate cooling in clusters by themselves (e.g., Conroy
& Ostriker 2008, Dolag et al. 2004), but they may play an important role when operating
in concert with the AGN feedback (Voit 2011) or instabilities in plasma (e.g., Sharma et al.
2012).

In summary, results of the theoretical studies discussed above indicate that the AGN energy 
feedback is the most likely energy source regulating the stellar masses of cluster galaxies 
throughout their evolution and suppressing cooling in cluster cores at low redshifts. The 
latter likely requires an interplay between the AGN feedback and a number of other phys- 
ical processes: e.g., injection of the cosmic rays in the high-entropy bubbles, buoyancy of 
these bubbles stabilized by magnetic fields, dissipation of their mechanical energy through 
turbulence, thermal conduction, and thermal instabilities. Although details of the interplay 
are not yet understood, it is clear that it must result in a robust self-regulating feedback cy- 
cle in which cooling immediately leads to the AGN activity that suppresses further cooling 
for a certain period of time. 



4. Regularity of the cluster populations 



Processes operating during cluster formation and evolution discussed in the previous section
are complex and nonlinear. However, it is now also clear that most of the complexity
is confined to cluster cores and affects a small fraction of the volume and mass of the
clusters. In this regime, clusters' observational properties exhibit strong deviations from the
self-similar scalings described in Section 3.9 (see also Voit 2005). At larger radii, the ICM is
remarkably regular. In this section, we discuss the origins of such highly regular behaviour
and the range of radii where it can be expected. We argue that the existence of this radial
range allows us to define integral observational quantities, which have low scatter for
clusters of a given mass and are not sensitive to the astrophysical processes operating during
cluster formation and evolution. This fact is especially important for the current and
future uses of clusters as cosmological probes (Allen, Evrard & Mantz 2011; Weinberg et al.
2012).



4.1 Characterizing regularity 



Several lines of observational evidence, based on X-ray measurements of gas density (e.g.,
Croston et al. 2008) and temperature profiles (Leccardi & Molendi 2008, Pratt et al. 2007,
Vikhlinin et al. 2006), and the combination of the two in the form of entropy profiles
(Cavagnolo et al. 2009), demonstrate that clusters have a variety of behaviors in central regions,
depending on the presence and prominence of cool cores. As discussed in Section 2, outside
of core regions, clusters behave as a more homogeneous population and obey assumptions
and expectations of the self-similar model (discussed above in Section 3.9.1). For instance,
Figure 3 shows that the ICM density is nearly independent of temperature once measured outside
of core regions, r ≳ r2500, at least for relatively hot systems with temperatures ≳ 3 keV. Quite
remarkably, observed and simulated temperature profiles agree with each other within this same
radial range (see Figure 4).

A good illustration of the regularity of the ICM properties is represented by the pressure
profiles shown in Figure 13 (from Arnaud et al. 2010, but see also Sun et al. 2011) rescaled
to the values of radius and pressure at r500. The perfectly regular, self-similar behavior
would correspond to a single line in this plot for clusters of all masses. The pressure profiles
shown in this figure are derived from X-ray observations and are defined as the product of
electron number density and temperature profiles. Similar profiles are now derived from
SZ observations, which probe pressure more directly (e.g., Bonamente et al. 2012). Quite
remarkably, the observed pressure profiles at r ≳ 0.2r500 follow a nearly universal profile
(see also Nagai, Kravtsov & Vikhlinin 2007), exhibiting fractional scatter of < 30% at
r ~ 0.2r500 and even smaller scatter of ~ 10 - 15% at r ~ 0.5r500. At smaller radii the
scatter of pressure profiles is much larger, with steep profiles corresponding to the cool
core clusters and flatter profiles for disturbed clusters. Figure 13 shows that simulated and
observed pressure profiles agree well with each other for r ≳ 0.2r500, i.e., in the regime
where the cluster population has a more regular behaviour. At smaller radii the profiles from
simulations are on average steeper than observed, and exhibit a lower degree of diversity
between cool core and non-cool core clusters.

The scatter in the cluster radial profiles can be used to define the following three radial 
regimes. 

1. Cluster cores, r < r2500, which exhibit the largest scatter and where scaling with mass
differs significantly from the self-similar scaling expectation. We do not yet have a com- 
plete and adequate theoretical understanding of the observed properties of the ICM and 
their diversity in the cluster cores. This is one of the areas of active ongoing theoretical 
and observational research. 

2. Intermediate radii, r2500 ≲ r < r500, which exhibit the smallest scatter and scaling with
mass close to the self-similar scaling. Although the processes affecting thermodynamics 
of these regions are not yet fully understood, the simple scaling and regular behavior 
make observable properties of clusters at these radii useful for connecting them to the 
total cluster mass. 

3. Cluster outskirts, r > r500, where scatter is increasing with radius and scaling with mass
can be expected to be close to self-similar on theoretical grounds, but has not yet been
constrained observationally. In this regime, clusters are dynamically younger, charac- 
terized by recent mergers, departures from equilibrium, and a significant degree of gas 
clumping. Significant progress is expected in the near future due to a combination of 
high-sensitivity SZ and X-ray observations using the next generation of instruments. 

The physical origin of the regular scaling with mass is the fact that cluster mass is the
key control variable of cluster formation, which sets the amount of gas mass, the average
temperature of the ICM, etc. It is important to note that the close to self-similar scaling with
mass outside the cluster core does not imply that the non-gravitational physical processes
are negligible in this regime. For instance, Sun et al. (2009) showed that entropy measured
at r500 has a scaling with temperature quite close to the self-similar prediction, yet its
level is higher than expected from a simple model in which only gravity determines the
evolution of the intra-cluster baryons. This implies that whatever mechanism one invokes
to account for such an entropy excess, it must operate in such a way as to not violate the
self-similar scaling. The scatter around the mean profile exhibited by clusters at different
radii can be due to a number of factors. In particular, the small scatter at intermediate radii
is a non-trivial fact, given that different clusters of the same mass are in different stages of
their dynamical evolution and physical processes affecting their profiles may have operated
differently due to different formation histories.

One of the interesting implications of the small scatter in the pressure profiles is that it
provides an upper limit on the contribution of non-thermal pressure support or, at least,
on its cluster-by-cluster variation. A well-known source of non-thermal pressure is
represented by residual gas motions induced by mergers, galaxy motions, and gas inflow along
large-scale filaments. Cosmological hydrodynamical simulations of cluster formation have
been extensively used to quantify the pressure support contributed by gas motions and the
corresponding level of violation of HE (e.g., Ameglio et al. 2009; Biffi, Dolag & Böhringer
2011; Lau, Kravtsov & Nagai 2009; Nagai, Kravtsov & Vikhlinin 2007; Piffaretti &
Valdarnini 2008; Rasia, Tormen & Moscardini 2004). All these analyses consistently found
that ICM velocity fields contribute a pressure support of about 5% to 15% of
the thermal one, the exact amount depending on the radial range considered (being larger
at larger radii) and on the dynamical state of the clusters. Currently, there are only
indirect indications for turbulent motions in the ICM of real clusters from fluctuations of
gas density measured in X-ray observations (e.g., Churazov et al. 2012, Schuecker et al.
2004). Direct measurements or upper limits on gas velocities and characterization of their
statistical properties should be feasible with future high-resolution spectroscopic and
polarimetric instruments on the next-generation X-ray telescopes (e.g., Inogamov & Sunyaev
2003, Zhuravleva et al. 2010).
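
As a rough illustration of what a 5% to 15% non-thermal pressure contribution means for HE
mass estimates, the sketch below assumes (a simplification introduced here, not a result of
the cited simulations) that the non-thermal pressure is a constant fraction alpha of the
thermal pressure at all radii. Under that assumption the two pressure gradients differ by the
same factor, so a mass derived from thermal pressure alone is low by a factor 1/(1 + alpha).
The function name is hypothetical.

```python
def hydrostatic_mass_ratio(alpha):
    """Return M_HE / M_true for P_nt = alpha * P_th, with alpha constant in radius.

    Hydrostatic equilibrium balances gravity against the gradient of the total
    pressure: dP_tot/dr = -G M(<r) rho / r^2. If only the thermal part is used,
    the inferred mass scales with dP_th/dr = (dP_tot/dr) / (1 + alpha).
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    return 1.0 / (1.0 + alpha)


if __name__ == "__main__":
    for alpha in (0.05, 0.10, 0.15):
        bias = 1.0 - hydrostatic_mass_ratio(alpha)
        print(f"P_nt/P_th = {alpha:.2f} -> HE mass low by {100 * bias:.1f}%")
```

Because the simulations indicate that the non-thermal fraction grows with radius, a constant
alpha should be read only as an order-of-magnitude guide.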



The galaxies and groups orbiting or infalling onto clusters not only stir the gas, but also
make the ICM clumpier. The dense inner regions of clusters ram-pressure strip the gas on
a fairly short time scale, so that the clumping is fairly small near cluster cores. However,
in cluster simulations it is substantial in the outskirts, where orbital times are longer and
accretion of new galaxies and groups is ongoing. Given that the X-ray emissivity scales
as the square of the local gas density, the clumpiness can bias the measurement of gas
density from X-ray surface brightness profiles toward higher values if it is not accounted
for. Because clumping is expected to increase with increasing cluster-centric radius, the
inferred slope of gas density profiles can be underestimated, thus affecting the resulting
pressure profile and hydrostatic mass estimates. Furthermore, gas clumping also affects the
X-ray temperature, which is measured by fitting the X-ray spectrum to a single-temperature
plasma model (Mazzotta et al. 2004, Vikhlinin 2006). Clumping can therefore contribute
to the scatter of pressure profiles at large radii, especially at r > r500 (e.g., Nagai & Lau
2011).
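
The size of the density bias can be illustrated with a short sketch. It uses the commonly
adopted clumping factor C = <rho^2>/<rho>^2, a definition that is not spelled out in the text
and is assumed here; since the emissivity scales as rho^2, a smooth-gas analysis then
overestimates the mean density by sqrt(C). The shell composition below is hypothetical.

```python
import numpy as np


def clumping_factor(density):
    """Clumping factor C = <rho^2> / <rho>^2 for gas densities in one radial shell."""
    rho = np.asarray(density, dtype=float)
    return np.mean(rho**2) / np.mean(rho) ** 2


def xray_inferred_density(density):
    """Density inferred from emissivity ~ rho^2 when the gas is assumed smooth."""
    rho = np.asarray(density, dtype=float)
    return np.sqrt(np.mean(rho**2))


if __name__ == "__main__":
    # Hypothetical shell: 95% of the volume is diffuse gas, 5% sits in clumps
    # that are ten times denser (arbitrary units).
    shell = np.concatenate([np.full(95, 1.0), np.full(5, 10.0)])
    c = clumping_factor(shell)
    print(f"C = {c:.2f}, density overestimated by a factor of {np.sqrt(c):.2f}")
```

A perfectly smooth shell gives C = 1 and no bias; any inhomogeneity gives C > 1.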

Indirect detections of gas clumping through X-ray observations out to r200 have been
recently claimed, based on Suzaku observations of a flattening in the X-ray surface brightness
profiles at such large radii (Simionescu et al. 2011). However, these results are prone to
significant systematic uncertainties (Ettori & Molendi 2011). Independent analyses based on
the ROSAT data (e.g., Eckert et al. 2011) show that the surface brightness profiles steepen
beyond r500 (see also Neumann 2005; Vikhlinin, Forman & Jones 1999), which is
inconsistent with the degree of gas clumping inferred from the Suzaku data, but consistent with
predictions of hydrodynamical simulations.

Clearly, the clumpiness of the ICM depends on a number of uncertain physical processes,
such as efficient feedback, which removes gas from merging structures, or thermal conduction,
which homogenizes the ICM temperatures (e.g., Dolag et al. 2004). The degree of
gas clumping in density and temperature is therefore currently uncertain in both theoretical
models and observations. Future high-sensitivity SZ observations of galaxy clusters with
improved angular resolution will allow a direct measurement of projected pressure profiles.
Their comparison with X-ray derived profiles will help in understanding the impact of gas
clumping on the thermal complexity of the ICM.

[Figure 13: pressure profiles versus radius in units of R500]

Fig. 13. Comparison between observed (black lines) and simulated (red lines with orange shaded
area) pressure profiles (from Arnaud et al. 2010). Observational data refer to the Representative
XMM-Newton Cluster Structure Survey (REXCESS) sample of nearby clusters (Böhringer et al.
2007) observed with XMM-Newton. Simulation results are obtained by combining different sets of
clusters simulated with both smoothed particle hydrodynamics and adaptive mesh refinement codes
(see Arnaud et al. 2010 for details). The continuous red line corresponds to the average profile
from simulations, after rescaling profiles according to the values of R500 and M500 predicted by
hydrostatic equilibrium, with the orange area showing the corresponding rms scatter. The red dotted
line shows the simulation results when using instead the true M500 value. The bottom panel shows
the ratio between average simulation profiles and average observed profiles.

Additional non-thermal pressure support can be provided by magnetic fields and
relativistic cosmic rays, the presence of which in the ICM is demonstrated by radio observations
of radio halos: diffuse and faint radio sources filling the central Mpc-scale region of many
galaxy clusters (e.g., Giovannini et al. 2009, Venturi et al. 2008) and arising from the
synchrotron emission of highly relativistic electrons moving in the ICM magnetic fields. The
origin of these relativistic particles still needs to be understood, although several models
have been proposed. Shocks and turbulence associated with merger events are expected to
compress and amplify magnetic fields and accelerate relativistic electrons (see, e.g., Ferrari
et al. 2008 and Dolag, Bykov & Diaferio 2008 for reviews). Numerical simulations including
injection of cosmic rays from accretion shocks and SN explosions (e.g., Pfrommer et al.
2007, Vazza et al. 2012) indicate that cosmic rays contribute a pressure support, which can
be as high as ~ 10% for relaxed clusters and ~ 20% for unrelaxed clusters at the outskirts.
At smaller radii, the pressure contribution of cosmic rays in these models becomes small
(< 3% at r < 0.1rvir), which is consistent with the upper limits from γ-ray observations by
the Fermi Gamma-ray Telescope (e.g., Ackermann et al. 2010).



The role of intracluster magnetic fields has been investigated in a number of studies using
cosmological simulations (see Dolag, Bykov & Diaferio 2008, for a review). The general
result is that pressure support from magnetic fields should be limited to ≲ 5%, which is
consistent with observational constraints on the magnetic field strength (~ μG) (e.g.,
Govoni et al. 2010, Vogt & Enßlin 2005) and upper limits on the contribution of magnetic fields
to non-thermal pressure support (e.g., Laganá, de Souza & Keller 2010).
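
A quick order-of-magnitude check shows why μG-level fields imply only a small pressure
contribution. The sketch compares the magnetic pressure B^2/(8π) with the thermal pressure
n k T in Gaussian-cgs units. The field strength, electron density, temperature, and the
number of particles per electron used below are illustrative assumptions, not values quoted
in the text.

```python
import math

KEV_TO_ERG = 1.602176634e-9  # 1 keV expressed in erg


def magnetic_to_thermal_pressure(b_gauss, n_e_cm3, kt_kev, particles_per_electron=1.9):
    """Ratio P_B / P_th in Gaussian-cgs units.

    P_B = B^2 / (8 pi) and P_th = n_tot k T, with n_tot = particles_per_electron * n_e.
    The default of about 1.9 particles per electron is an assumed value for a
    fully ionized plasma of roughly primordial composition.
    """
    p_mag = b_gauss**2 / (8.0 * math.pi)
    p_th = particles_per_electron * n_e_cm3 * kt_kev * KEV_TO_ERG
    return p_mag / p_th


if __name__ == "__main__":
    # Assumed ICM conditions: n_e = 1e-3 cm^-3, kT = 5 keV.
    for b_micro_gauss in (1.0, 3.0):
        ratio = magnetic_to_thermal_pressure(b_micro_gauss * 1e-6, 1e-3, 5.0)
        print(f"B = {b_micro_gauss:.0f} microgauss -> P_B/P_th = {100 * ratio:.2f}%")
```

Because the magnetic pressure scales as B^2, the ratio is sensitive to the assumed field
strength, which is why the observational constraints on B matter for this limit.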

In summary, the scatter in cluster profiles in the cluster cores is mainly driven by
differences in physical processes such as cooling and heating by AGN feedback and by the different
merger activity that different clusters experienced during their evolution. At intermediate
radii, the scatter is small because the ICM is generally in good HE within the cluster
gravitational potential and because processes that shaped its thermodynamic properties have not
introduced a new mass scale, so that self-similar scaling is not broken. In the cluster outskirts,
the scatter is expected to be driven by deviations from HE and other sources of non-thermal
pressure support such as cosmic rays, as well as by a rapid increase of ICM clumpiness
with increasing radius.



4.2 Scaling relations 

The existence of a radial range where ICM properties scale with mass similarly to the
self-similar expectation with a small scatter implies that we can define integral observable
quantities within this range that will obey tight scaling relations among themselves and
with the total cluster mass. Furthermore, these scaling relations are also expected to be
weakly sensitive to the cluster dynamical state, given that relaxed and unrelaxed clusters
have similar profiles at these intermediate radii. Indeed, as we showed in § 2 (see Fig. 5),
X-ray luminosity measured within the radial range [0.15-1] r500 exhibits a tight scaling against
the total ICM thermal content measured by the Yx parameter, with relaxed and unrelaxed
clusters following the same relation. Here Yx is defined as the product of gas mass and
X-ray temperature, both measured within r500; like Lx, the temperature is measured after
excising the contribution from r < 0.15r500.

As discussed in § 3.9, the gas temperature T, gas mass Mgas, and total thermal content of the
ICM, Y = Mgas T, are commonly used examples of integral observational quantities whose
scaling relations with cluster mass are predicted by the self-similar model and for which
calibrations based on X-ray and SZ observations (or their combinations) and simulations
are available. For example, Figure 14 shows the scaling relation between Yx and M500 for
simulated clusters and for a set of clusters with detailed Chandra observations from a study
by Kravtsov, Vikhlinin & Nagai (2006), where Yx was introduced and defined specifically
to use the temperature estimated only at 0.15r500 < r < r500 in order to minimize the scatter.
The relation of Yx with M500 in simulations has a scatter of only ≈ 8% when both relaxed and
unrelaxed clusters are included, and the evolution of its normalization with redshift is consistent
with expectations of the self-similar model. The insensitivity of the relation to the dynamical
state of clusters is not trivial and is due to the fact that during mergers clusters move almost
exactly along the relation (e.g., Poole et al. 2007, Rasia et al. 2011). In addition, the slope
and normalization of the Yx - M500 relation are also not sensitive to specific assumptions
in modelling cooling and feedback heating processes in simulations (Fabjan et al. 2011,
Stanek et al. 2010), which makes the relation more robust theoretically.
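
The kind of fit used to quantify such a relation can be sketched as follows: a power law
M500 = C Yx^(3/5) with the slope held at the self-similar value, fitted in log space, with
the scatter measured as the rms of the logarithmic residuals. The mock data, the arbitrary
normalization, and the function name are assumptions made for illustration at a single
redshift; they do not reproduce the cited analyses.

```python
import numpy as np

SELF_SIMILAR_SLOPE = 3.0 / 5.0  # M500 proportional to Yx^(3/5) at fixed redshift


def fit_fixed_slope(yx, m500, slope=SELF_SIMILAR_SLOPE):
    """Fit M500 = C * Yx**slope with the slope held fixed.

    Returns the normalization C and the rms scatter of ln(M500) about the
    relation, which approximates the fractional scatter when it is small.
    """
    log_y = np.log(np.asarray(yx, dtype=float))
    log_m = np.log(np.asarray(m500, dtype=float))
    residual = log_m - slope * log_y
    log_c = residual.mean()  # least-squares intercept for a fixed slope
    scatter = np.std(residual - log_c)
    return np.exp(log_c), scatter


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    yx = 10 ** rng.uniform(13.0, 15.0, size=200)          # mock Yx values
    m500 = 2.0e5 * yx**SELF_SIMILAR_SLOPE                 # arbitrary normalization
    m500 *= np.exp(rng.normal(0.0, 0.08, size=yx.size))   # 8% log-normal scatter
    norm, scatter = fit_fixed_slope(yx, m500)
    print(f"C = {norm:.3e}, scatter in M500 at fixed Yx = {scatter:.3f}")
```

Fixing the slope leaves the normalization as the only free parameter, so a shift such as a
hydrostatic mass bias appears directly as a change in C.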



The Ysz-M relation also exhibits a comparably low scatter, and the slope and evolution of its
normalization are close to the predictions of the self-similar model (da Silva et al. 2004,
Motl et al. 2005), which is not surprising given the similarity between Yx and the
integrated Ysz measured from SZ observations. Its normalization changes by up to 30-40%
depending on the interplay between radiative cooling and feedback processes included in
the simulations (e.g., Battaglia et al. 2011, Bonaldi et al. 2007, Nagai 2006, and references
therein). At the same time, simulation analysis is also shedding light on the effect of
projection (e.g., Kay et al. 2012) and mergers (e.g., Krause et al. 2012) on the scatter in the
Ysz-M scaling.

The tight relation of integral quantities such as Yx, Ysz, Mg, or core-excised X-ray luminosity
with the total mass makes them good proxies for observational estimates of cluster mass,
which can be used at high redshifts even with a relatively small number of X-ray photons.
For instance, integral measurements of gas mass or temperature require ~ 10^3 photons,
which is feasible for statistically complete cluster samples out to z ~ 1 (e.g., Mantz et al.
2010a, Maughan 2007, Vikhlinin et al. 2009a) or even beyond. This makes these integral
quantities very useful as "mass proxies" in cosmological analyses of cluster populations
(e.g., Allen, Evrard & Mantz 2011). Clearly, the relation of such mass proxies with the
actual mass needs to be calibrated both via detailed observations of small controlled cluster
samples and in cosmological simulations of cluster formation.

The potential danger of relying on simulations for this calibration is that results could be
sensitive to the details of the physical processes included. This implies that a mass proxy
is required not only to have a low scatter in its scaling with mass, but also to be robust
against changes in the uncertain description of the ICM physics. As we noted above, Yx
is quite robust to changes within a wide range of assumptions about cooling and heating
processes affecting the ICM. This is illustrated in Figure 15 (taken from Fabjan et al. 2011),
which shows how the normalization and slope of the scaling relation of gas mass and Yx
versus M500 change with the physical processes included. The evolution of the Yx - M500
relation with redshift is also consistent with self-similar expectations for different models of
cooling and feedback. Other quantities, such as Mg, often exhibit a similar or even smaller
degree of scatter compared to Yx but are more sensitive to the choice of physical processes
included in simulations. An additional practical consideration is that theoretical models
should consider observables derived from mock observations of simulated clusters that take
into account instrumental effects of detectors and projection effects (e.g., Biffi et al. 2011;
Nagai, Vikhlinin & Kravtsov 2007; Rasia et al. 2006).

Fig. 14. The Yx-M500 relation for a set of simulated clusters at z = 0 (circles) and for a sample
of relaxed Chandra clusters from Vikhlinin et al. (2006) (stars with error bars). Filled and open
circles refer to simulated clusters which are classified as relaxed and unrelaxed, respectively. Core
regions inside 0.15r500 are excised in the measurement of the X-ray temperature entering into the
computation of Yx, for both simulated and real clusters. True and hydrostatic masses are shown
for simulated and observed clusters, respectively. The dot-dashed line shows the best-fit power-law
relation for the simulated clusters with the slope fixed to the self-similar value of α = 3/5. The
dashed line shows the same best-fit power-law relation to simulations, but with the normalization
scaled down by 15%, which takes into account the putative effect of hydrostatic mass bias due to
residual gas motions. From Kravtsov, Vikhlinin & Nagai (2006).



Ultimately, calibration of mass proxies for precision use should be obtained via independent
observational mass measurements, using weak lensing analysis, HE, or velocity
dispersions of member galaxies. The combination of future large, wide-area X-ray, SZ,
and optical/near-IR surveys should provide significant progress in this direction.

Fig. 15. Sensitivity of different mass proxies to the physical description of the intracluster medium
included in cosmological hydrodynamical simulations for a set of galaxy clusters (from Fabjan
et al. 2011). Results for the scaling relation of M500 with gas mass Mgas and Yx are shown in the
left and right panels, respectively. Best-fitting normalization C and slope α of the scaling relations
of these mass proxies are shown in the upper and lower panels, respectively. Here, Tmw is
the mass-weighted temperature computed excluding the central cluster regions within 0.15r500. In
the lower panels, the horizontal dashed lines mark the values of the slope of the scaling relations
predicted by the self-similar model. Results are shown for simulations only including non-radiative
hydrodynamics with standard (NR-SV) and reduced artificial viscosity (NR-RV); cooling and star
formation without (CSF) and with thermal conduction (CSF-C); cooling and star formation with
metal enrichment, with (CSF-M-W) and without (CSF-M-NW) galactic winds from SN explosions;
and cooling and star formation with the effect of AGN feedback (CSF-M-AGN) (see Fabjan et al.
2011 for further details).



5. Cluster formation in alternative cosmological models 



In previous sections, we have discussed the main elements of cluster formation in the standard
ΛCDM cosmology. Although this model is very successful in explaining a wide variety
of observations, some of its key assumptions and ingredients are not yet fully tested.
This provides motivation to explore different assumptions and alternative models.

As discussed in Section 3.7, the halo mass function for a Gaussian random field is uniquely
specified by the peak height $\nu = \delta_c/\sigma(R, z)$, where R is the filtering scale corresponding
to the cluster mass scale M. For sufficiently large mass, that is, rare peaks with $\nu \gg 1$,
the mass function becomes exponentially sensitive to the value of $\nu$. At the same time,
the mass function also determines the halo bias (see Section 3.8). Again, for $\nu \gg 1$ and
Gaussian perturbations, the bias function scales as $b(\nu) \sim \nu^2/\delta_c = \nu/\sigma(R, z)$. Therefore, the
cluster 2-point correlation function can be written as $\xi_{\rm cl}(r) = \nu^2\,[\xi_R(r)/\sigma^2(R, z)]$, where $\xi_R(r)$ is
the correlation function of the smoothed fluctuation field (see Section 3.1). Once the peak
height $\nu$ is constrained by requiring a model to predict the observed cluster abundance, the
value of the cluster correlation function at a single scale r provides a measurement of the
shape of the power spectrum through the ratio of the clustering strength at the scale r and at
the cluster characteristic scale R. These predictions are only valid under two assumptions,
namely Gaussianity of primordial density perturbations and scale independence of the linear
growth function D(z), as predicted by the standard theory of gravity. Therefore, the
combination of number counts and large-scale clustering studies offers a powerful means
to constrain the possible violation of either one of these two assumptions that hold for the
ΛCDM model.
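
The chain of relations in the preceding paragraph can be written out explicitly. The steps
below assume linear bias on large scales, so that the cluster correlation function is the
square of the bias times that of the smoothed field. The last equality uses the standard
identification of the variance of the smoothed field with its zero-lag correlation; it is
added here for illustration and is not stated in the text.

```latex
\begin{align}
  b(\nu) &\simeq \frac{\nu^2}{\delta_c} = \frac{\nu}{\sigma(R,z)}
     \qquad (\nu \gg 1), \\
  \xi_{\rm cl}(r) &= b^2(\nu)\,\xi_R(r)
     = \nu^2\,\frac{\xi_R(r)}{\sigma^2(R,z)}
     = \nu^2\,\frac{\xi_R(r)}{\xi_R(0)} .
\end{align}
```

Written this way, the cluster correlation function at fixed peak height depends only on the
ratio of the smoothed correlation function at separation r to its value at zero lag, which is
the sense in which it probes the shape of the power spectrum.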

In this section, we briefly review the specifics of cluster formation in models with non- 
Gaussian initial density field and with non-standard gravity, the most frequently discussed 
modifications to the standard structure formation paradigm. 



5.1 Mass function and bias of clusters in non-Gaussian models 



One of the key assumptions of the standard model of structure formation is that initial
density perturbations are described by a Gaussian random field (see Section 3.1). The simplest
single-field, slow-roll inflation models predict nearly Gaussian initial density fields.
However, deviations from Gaussianity are expected in a broad range of inflation models that
violate the slow-roll approximation, have multiple fields, or have modified kinetic terms (see
Bartolo et al. 2004, for a review). Given that there is no single preferred inflation model,
we do not know which specific form of non-Gaussianity is possibly realized in nature.
Deviations from Gaussianity are parameterized using a heuristic functional form. One of the
simplest and most common choices for such a form is the local non-Gaussian potential
given by $\Phi_{\rm NG}(x) = -(\phi_{\rm G}(x) + f_{\rm NL}[\phi_{\rm G}(x)^2 - \langle\phi_{\rm G}^2\rangle])$, where $\Phi_{\rm NG}$ is the usual Newtonian
potential, $\phi_{\rm G}$ is the Gaussian random field with zero mean, and the parameter $f_{\rm NL} = {\rm const}$
controls the degree and nature of non-Gaussianity (e.g., Komatsu & Spergel 2001; Matarrese,
Verde & Jimenez 2000; Salopek & Bond 1990). The simplest inflation models predict
$f_{\rm NL} \sim 10^{-2}$ (e.g., Maldacena 2003), but a number of models that predict a much larger degree
of non-Gaussianity exist as well (Bartolo et al. 2004). The current CMB constraint on
scale-independent non-Gaussianity is $f_{\rm NL} = 30 \pm 20$ at the 68% confidence level (e.g., Komatsu
2010) and there is thus still room for the existence of sizable deviations from Gaussianity.

The non-Gaussian fields with $f_{\rm NL} < 0$ have a PDF of the potential field that is skewed
toward positive values, and the abundance of peaks that seed the collapse of halos is
reduced compared to Gaussian initial conditions. Conversely, the PDF of the potential field
in models with $f_{\rm NL} > 0$ has negative skewness, and hence an increased number of potential
minima (density peaks). This would result in an enhanced abundance of rare objects, such
as massive distant clusters, relative to the Gaussian case (see, e.g., figure 1 in Dalal et al.
2008, for an illustration of the effect of $f_{\rm NL}$ on the large-scale structure that forms). The
suppression or enhancement of abundance of halos increases with increasing peak height.
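
The sign of the skewness can be checked numerically with a minimal sketch of the local
transformation of a Gaussian field. For visibility the example uses a unit-variance Gaussian
field and |f_NL| = 0.1, which exaggerates the non-Gaussianity by a large factor relative to a
physical potential with fluctuations of order 10^-5; these values and the function names are
assumptions made for illustration.

```python
import numpy as np


def local_ng_potential(phi_gauss, f_nl):
    """Phi_NG = -(phi_G + f_NL * (phi_G^2 - <phi_G^2>)), the local ansatz."""
    phi = np.asarray(phi_gauss, dtype=float)
    return -(phi + f_nl * (phi**2 - np.mean(phi**2)))


def skewness(x):
    """Sample skewness: third central moment over variance^(3/2)."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(d**3) / np.mean(d**2) ** 1.5


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    phi_g = rng.normal(0.0, 1.0, size=200_000)  # unit variance, not physical units
    for f_nl in (-0.1, 0.0, 0.1):
        s = skewness(local_ng_potential(phi_g, f_nl))
        print(f"f_NL = {f_nl:+.1f}: skewness of Phi_NG = {s:+.3f}")
```

With this sign convention, positive f_NL yields a negatively skewed potential, that is, more
deep potential minima, in line with the enhanced abundance of rare peaks described above.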

The mass functions resulting from non-Gaussian initial conditions have been studied both
analytically (e.g., Afshordi & Tolley 2008; Chiu, Ostriker & Strauss 1998; Lo Verde et al.
2008; Matarrese, Verde & Jimenez 2000) and using cosmological simulations (Dalal et al.
2008, Grossi et al. 2007, Lo Verde et al. 2008, Lo Verde & Smith 2011, Wagner & Verde
2012). These studies showed that accurate formulae for the halo abundance from the initial
linear density field exist for the non-Gaussian models as well. The general result is that the
fractional change in the abundance of the rarest peaks is of order unity for the initial fields
with $|f_{\rm NL}| \sim 100$. The abundance of clusters is thus only mildly sensitive to deviations from
Gaussianity within the currently constrained limits (Cunha, Huterer & Doré 2010; Sartoris
et al. 2010; Scoccimarro, Sefusatti & Zaldarriaga 2004; Sefusatti et al. 2007). However,
primordial non-Gaussianity may also leave an imprint in the spatial distribution of clusters
in the form of a scale-dependence of large-scale linear bias.



As was discovered by Dalal et al. (2008) and confirmed in subsequent analytical (Afshordi
& Tolley 2008; Matarrese & Verde 2008; McDonald 2008; Slosar et al. 2008; Taruya,
Koyama & Matsubara 2008) and numerical studies (Desjacques, Seljak & Iliev 2009;
Grossi et al. 2009; Pillepich, Porciani & Hahn 2010; Shandera, Dalal & Huterer 2011), the
linear bias of collapsed objects in the models with local non-Gaussianity can be described
as a function of wavenumber $k$ by $b_{\rm NG} = b_{\rm G} + f_{\rm NL} \times {\rm const}/k^2$, where $b_{\rm G}$ is the linear bias
in the corresponding cosmological model with the Gaussian initial conditions discussed in
§ 3.8. This scale dependence arises because in the non-Gaussian models the large-scale
modes that boost the abundance of peaks are correlated with the peaks themselves, which
enhances (or suppresses) the peak amplitudes by a factor proportional to $f_{\rm NL}\phi \propto f_{\rm NL}\delta/k^2$.
Because this effect of modulation increases with increasing peak height, $\nu = \delta_c/\sigma(M,z)$,
the scale-dependence of bias increases with increasing halo mass. This unique signature
can be used as a powerful constraint on deviations from Gaussianity (at least for models
with local non-Gaussianity) in large samples of clusters in which the power spectrum or
correlation function can be measured on large scales.
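
As a rough illustration of the $1/k^2$ scaling described above, the sketch below evaluates a commonly quoted explicit form of the correction, $\Delta b(k) = 3 f_{\rm NL}(b_{\rm G}-1)\,\delta_c\,\Omega_m H_0^2/[c^2 k^2 T(k) D(z)]$, usually attributed to Dalal et al. (2008). The function name, the default parameter values, and the simplifying choices $T(k)=1$ and $D(z)=1$ are assumptions made here for illustration only; normalization conventions for $D(z)$ differ between papers and should be checked before any quantitative use.

```python
# Illustrative sketch (not the authors' code) of the scale-dependent bias
# correction for local-type primordial non-Gaussianity:
#   delta_b(k) = 3 f_NL (b_G - 1) delta_c Omega_m H0^2 / (c^2 k^2 T(k) D(z))
# All default values below are assumptions chosen for illustration.

C_KM_S = 299792.458  # speed of light in km/s
DELTA_C = 1.686      # spherical-collapse threshold (Einstein-de Sitter value)


def delta_b(k, f_nl, b_g, omega_m=0.3, growth=1.0, transfer=1.0):
    """Return the bias correction at wavenumber k (in h/Mpc).

    growth and transfer are caller-supplied values of D(z) and T(k);
    the defaults of 1.0 are a large-scale, z = 0 simplification.
    """
    h0_over_c = 100.0 / C_KM_S  # H0/c in units of h/Mpc
    amplitude = 3.0 * f_nl * (b_g - 1.0) * DELTA_C * omega_m * h0_over_c**2
    return amplitude / (k**2 * transfer * growth)


if __name__ == "__main__":
    # Hypothetical cluster-like tracer with Gaussian bias b_G = 3
    for k in (0.005, 0.01, 0.02, 0.05):
        print(f"k = {k:5.3f} h/Mpc  delta_b = {delta_b(k, 100.0, 3.0):.4f}")
```

In this form the correction vanishes for tracers with $b_{\rm G} = 1$ and grows with $b_{\rm G}$, which is consistent with the statement that the effect is strongest for the most massive halos.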



5.2 Formation of clusters in modified gravity models 



Recently, there has been a renewed interest in modifications to the standard GR theory of
gravity (see, e.g., Capozziello & De Laurentis 2011, Durrer & Maartens 2008, Silvestri &
Trodden 2009, for recent reviews). These models have implications not only for cosmic
expansion, but also for the evolution of density perturbations and, therefore, for the formation
of galaxy clusters.

For instance, in the class of $f(R)$ models, cosmic acceleration arises from a modification
of the gravity law given by the addition of a general function $f(R)$ of the Ricci curvature scalar
$R$ in the Einstein-Hilbert action (see, e.g., Jain & Khoury 2010, Sotiriou & Faraoni 2010, for
recent reviews). Such modifications result in enhancements of gravitational forces on scales
relevant for structure formation in such a way that the resulting linear perturbation growth
rate $D$ becomes scale dependent: whereas on very large scales gravity behaves similarly
to GR gravity, on smaller scales it is enhanced compared to GR and the rate of structure
formation is thereby also enhanced. The nonlinear halo collapse and growth are also faster
in $f(R)$ models, which leads to enhanced abundance of massive clusters (Ferraro, Schmidt
& Hu 2011; Schmidt et al. 2009; Zhao, Li & Koyama 2011) compared to the predictions
of the models with GR gravity and identical cosmological parameters. Likewise, the peaks 
collapsing by a given $z$ have lower peak height $\nu$ in the modified gravity models compared
to the peak height in the standard gravity model. This results in the reduced bias of clusters
of a given mass compared to the standard model. Furthermore, the scale dependence of
the linear growth also induces a scale dependence of bias, thus offering another route to
detect modifications of gravity (Parfrey, Hui & Sheth 2011). Qualitatively similar effects on
cluster abundance and bias are expected in the braneworld-modified gravity models based
on higher dimensions, such as the Dvali-Gabadadze-Porrati (DGP; Dvali, Gabadadze &
Porrati 2000) gravity model (Khoury & Wyman 2009; Schäfer & Koyama 2008; Schmidt
2009; Schmidt, Hu & Lima 2010) and its successors with similar LSS phenomenology
consistent with current observational constraints, such as models of ghost-free massive
gravity (D'Amico et al. 2011; de Rham, Gabadadze & Tolley 2011).

Fig. 16. The potential of future cluster X-ray surveys to constrain deviations from Gaussian density
perturbations (adapted from Sartoris et al. 2010). The figure shows constraints on the power-spectrum
normalization, $\sigma_8$, and non-Gaussianity parameter, $f_{\rm NL}$, expected from surveys of galaxy
clusters to be carried out with the next-generation Wide Field X-ray Telescope. Dot-dashed blue curve
and dashed green curve show the 68% confidence regions provided by the evolution of power
spectrum (PS) of the cluster distribution and cluster number counts (NC), respectively. The solid red
ellipse shows the constraints obtained by combining number counts and power spectrum
information. Cosmic Microwave Background Planck priors for Gaussian perturbations have been included
in the analysis.



A general consequence of modifying gravity is that the Birkhoff theorem no longer holds,
which prevents a straightforward extension of the spherical collapse model described
in Section 3.2 to a generic model of modified gravity. Nevertheless, numerical calculations
of spherical collapse have been presented for a number of specific models (e.g., Martino,
Stabenau & Sheth 2009; Schäfer & Koyama 2008; Schmidt, Hu & Lima 2010; Schmidt
et al. 2009). For both the $f(R)$ and the DGP classes of models, the results of simulations
obtained so far suggest that the halo mass function and bias can still be described by the
universal functions of peak height, in which the threshold for collapse and the linear growth rate
are modified appropriately from their standard model values (Schmidt, Hu & Lima 2010;
Schmidt et al. 2009). This implies that it should be possible to calibrate the mass function and
bias of halos in the modified gravity models with an accuracy comparable to that in the
standard structure formation models.
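
The idea of reusing universal functions of peak height with a modified collapse threshold and growth rate can be sketched with the simplest such functions, the Press-Schechter multiplicity function and its peak-background-split bias. This is only a toy illustration: the Press-Schechter forms stand in for the calibrated fitting functions used in the cited studies, and the scale-independent `growth_boost` factor and the function names are hypothetical simplifications (in $f(R)$ models the enhancement of growth is scale dependent).

```python
import math

DELTA_C_GR = 1.686  # spherical-collapse threshold in the standard model


def ps_multiplicity(nu):
    """Press-Schechter multiplicity f(nu) = sqrt(2/pi) * nu * exp(-nu^2 / 2)."""
    return math.sqrt(2.0 / math.pi) * nu * math.exp(-0.5 * nu * nu)


def ps_bias(nu, delta_c=DELTA_C_GR):
    """Peak-background-split bias for Press-Schechter: 1 + (nu^2 - 1) / delta_c."""
    return 1.0 + (nu * nu - 1.0) / delta_c


def peak_height(sigma_gr, growth_boost=1.0, delta_c=DELTA_C_GR):
    """nu = delta_c / sigma, with sigma rescaled by an assumed growth boost."""
    return delta_c / (sigma_gr * growth_boost)


def abundance_ratio(sigma_gr, growth_boost, delta_c_mod=DELTA_C_GR):
    """Modified-to-standard ratio of f(nu) at fixed mass (fixed sigma_gr).

    For a scale-independent boost the factor d ln(sigma) / d ln(M) is
    unchanged, so this is also the ratio of the mass functions.
    """
    nu_gr = peak_height(sigma_gr)
    nu_mod = peak_height(sigma_gr, growth_boost, delta_c_mod)
    return ps_multiplicity(nu_mod) / ps_multiplicity(nu_gr)


if __name__ == "__main__":
    # Hypothetical 5% enhancement of the linear growth factor
    for sigma in (1.0, 0.7, 0.5):
        nu_gr = peak_height(sigma)
        nu_mod = peak_height(sigma, 1.05)
        print(f"sigma = {sigma:.1f}: abundance ratio = "
              f"{abundance_ratio(sigma, 1.05):.3f}, "
              f"bias {ps_bias(nu_gr):.2f} -> {ps_bias(nu_mod):.2f}")
```

Under these assumptions the enhancement of abundance grows toward rarer (lower $\sigma$) objects, while the bias at fixed mass decreases, in qualitative agreement with the trends described in the text.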

Fig. 17. The potential of future cluster surveys to constrain deviations from General Relativity (from
Vikhlinin et al. 2009d). The figure shows the linear growth factor of density perturbations,
$G(z) \equiv D(z)$ (not normalized to unity at $z = 0$), as recovered from 2000 clusters distributed in
20 redshift bins, each containing 100 massive clusters, identified in a high-sensitivity X-ray cluster
survey. The solid black line indicates the evolution of the linear growth factor for a ΛCDM model,
whereas the dashed blue curve is the prediction of a modified gravity model (the braneworld model
by Dvali, Gabadadze & Porrati 2000) that has the same expansion history as the ΛCDM model.



6. Summary and outlook 



All of the main elements of the overall narrative of how clusters form and evolve discussed 
in this review have been established over the past four decades. The remarkable progress in 
our understanding of cluster formation has been accompanied by great progress in multi- 
wavelength observations of clusters and our knowledge of the properties of the main mass 
constituents of clusters: stars, hot intracluster gas, and gravitationally dominant DM. 

Formation of galaxy clusters is a complicated, non-linear process accompanied by a host 
of physical phenomena on a wide range of scales. Yet, some aspects of clusters exhibit 
remarkable regularity, and their internal structure, abundance, and spatial distribution carry 
an indelible memory of the initial linear density perturbation field and the cosmic expansion 
history. This is manifested both by tight scaling relations between cluster properties and the
total mass and by the approximate universality of the cluster mass function and bias
when expressed as a function of the peak height $\nu$.



Likewise, there is abundant observational evidence that complex processes - in the form 
of a non-linear, self-regulating cycle of gas cooling and accretion onto the SMBHs and as- 
sociated feedback - have been operating in the central regions of clusters. In addition, the 
ICM is stirred by continuing accretion of the intergalactic gas, motion of cluster galaxies, 
and AGN bubbles. Studies of cluster cores provide a unique window into the interplay be- 
tween the evolution of the most massive galaxies, taking place under extreme environmental 
conditions, and the physics of the diffuse hot baryons. At the same time, processes accom- 
panying galaxy formation also leave a mark on the ICM properties at larger radii. In these 
regions, the gas entropy measured from observations is considerably higher than predicted 
by simple models that do not include such processes, and the ICM is also significantly en- 
riched by heavy elements. This highlights that the ICM properties are the end product of 
the past interaction between the galaxy evolution processes and the intergalactic medium. 
Nevertheless, at intermediate radii, $r_{2500} \lesssim r \lesssim r_{500}$, the scaling of the radial profiles of
gas density, temperature, and pressure with the total mass is close to simple, self-similar
expectations for clusters of sufficiently large mass (corresponding to $k_{\rm B}T \gtrsim 2$-$3$ keV).
This implies that the baryon processes affecting the ICM during cluster formation do not 
introduce a new mass scale. Such regular behaviour of the ICM profiles provides a basis 
for the definition of integrated quantities, such as the core-excised X-ray luminosity and 
temperature, gas mass, or integrated pressure, which are tightly correlated with each other 
and with the total cluster mass. 
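
For orientation, the "simple, self-similar expectations" mentioned above are usually summarized by the following standard scalings for masses defined within a fixed overdensity relative to the critical density. The notation here is an assumption of this note rather than a quotation from the text: $E(z) \equiv H(z)/H_0$, $M_{\rm gas}$ is the gas mass, $Y_X \equiv M_{\rm gas} T$, and $L_{\rm bol}$ is the bolometric luminosity for pure bremsstrahlung emission.

```latex
\begin{align}
  T             &\propto M^{2/3}\, E(z)^{2/3}, \\
  M_{\rm gas}   &\propto M, \\
  Y_X           &\propto M^{5/3}\, E(z)^{2/3}, \\
  L_{\rm bol}   &\propto M^{4/3}\, E(z)^{7/3}.
\end{align}
```

The first relation follows from $T \propto M/R$ with $M \propto \rho_{\rm crit}(z) R^3$ and $\rho_{\rm crit} \propto E(z)^2$; the others follow from it under the assumptions of a constant gas fraction and $L_{\rm bol} \propto \rho\, M_{\rm gas}\, T^{1/2}$.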

The low-scatter scaling relations are used to interpret abundance and spatial distribution of
clusters and derive cosmological constraints (see Allen, Evrard & Mantz 2011 and Weinberg
et al. 2012 for recent reviews). Currently, cluster counts measured at high redshifts
provide interesting constraints on cosmological parameters complementary to other methods
(e.g., Mantz et al. 2010b, Rozo et al. 2010, Vikhlinin et al. 2009c) and a crucial test
of the entire class of ΛCDM and quintessence models (e.g., Benson et al. 2011; Jee et al.
2011; Mortonson, Hu & Huterer 2011). Although the statistical power of large future cluster
surveys will put increasingly more stringent requirements on the theoretical uncertainties
associated with cluster scaling relations and mass function (Cunha & Evrard 2010; Wu,
Zentner & Wechsler 2010), future cluster samples can provide competitive constraints on
the non-Gaussianity in the initial density field and deviations from GR gravity.

A combination of cluster abundance and large-scale clustering measurements can be used
to derive stringent constraints on cosmological parameters and possible deviations from
the standard ΛCDM paradigm. As an example, Figure 16 shows the constraints on the
normalization of the power spectrum and the $f_{\rm NL}$ parameter expected for a future
high-sensitivity X-ray cluster survey (from Sartoris et al. 2010). It shows that future cluster
surveys can achieve a precision of $\sigma_{f_{\rm NL}} \approx 5$-$10$ (see also Cunha, Huterer & Doré 2010;
Pillepich, Porciani & Reiprich 2012), thus complementing at smaller scales constraints
on non-Gaussianity, which are to be provided on larger scales by observations of CMB
anisotropies from the Planck satellite.

Although a variety of methods will provide constraints on the equation of state of DE and
other cosmological parameters (e.g., Weinberg et al. 2012), clusters will remain one of the
most powerful ways to probe deviations from GR gravity (e.g., Lombriser et al. 2009).
Even now, the strongest constraints on deviations from GR on the Hubble horizon scales
are derived from the combination of the measured redshift evolution of cluster number
counts and geometrical probes of cosmic expansion (Schmidt, Vikhlinin & Hu 2009). Figure
17 illustrates the potential constraints on the linear rate of perturbation growth that can
be derived from a future high-sensitivity X-ray cluster survey using similar analysis. The
figure shows that a sample of about 2000 clusters at $z < 2$ with well-calibrated mass
measurements would allow one to distinguish the standard ΛCDM model from a
braneworld-modified gravity model with the identical expansion history at a high confidence level.
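
To give a feel for how two models with identical expansion histories can differ in growth, the sketch below uses the growth-index approximation $d\ln D/d\ln a \simeq \Omega_m(a)^\gamma$, with the commonly quoted values $\gamma \approx 0.55$ for GR and $\gamma \approx 0.68$ for DGP-like gravity. This parametrization is swapped in here for simplicity and is not the calculation behind Figure 17; the function names, the value $\Omega_m = 0.3$, and the choice to normalize $D = a$ deep in matter domination are all assumptions of this example.

```python
import math

GAMMA_GR = 0.55   # commonly quoted growth index for GR with LCDM expansion
GAMMA_DGP = 0.68  # commonly quoted approximate growth index for DGP gravity


def omega_m_of_a(a, omega_m0=0.3):
    """Matter density parameter at scale factor a for flat LCDM expansion."""
    return omega_m0 / (omega_m0 + (1.0 - omega_m0) * a**3)


def growth_factor(a_final, gamma, omega_m0=0.3, a_init=1e-3, n_steps=2000):
    """Linear growth factor D(a_final), normalized so that D = a at a_init.

    Integrates d ln D / d ln a = Omega_m(a)^gamma with the midpoint rule.
    """
    ln_a0 = math.log(a_init)
    step = (math.log(a_final) - ln_a0) / n_steps
    integral = 0.0
    for i in range(n_steps):
        a_mid = math.exp(ln_a0 + (i + 0.5) * step)
        integral += omega_m_of_a(a_mid, omega_m0) ** gamma * step
    return a_init * math.exp(integral)


if __name__ == "__main__":
    for z in (2.0, 1.0, 0.5, 0.0):
        a = 1.0 / (1.0 + z)
        d_gr = growth_factor(a, GAMMA_GR)
        d_dgp = growth_factor(a, GAMMA_DGP)
        print(f"z = {z:3.1f}: D_GR = {d_gr:.3f}, D_DGP = {d_dgp:.3f}, "
              f"ratio = {d_dgp / d_gr:.3f}")
```

With the same $\Omega_m(a)$ in both cases, the larger growth index gives slower growth at late times, so the two curves separate toward low redshift; this is the kind of difference that a large cluster sample with well-calibrated masses could detect.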

The construction of such large, homogeneous samples of clusters will be aided in the next 
decade by a number of cluster surveys both in the optical/near-IR (e.g., DES, PanSTARRS, 
EUCLID) and X-ray (e.g., eROSITA, WFXT) bands. At the same time, the combination of 
higher resolution numerical simulations including more sophisticated treatment of galaxy 
formation processes and high-sensitivity multi-wavelength observations of clusters should 
help to unveil the nature of the physical processes driving the evolution of clusters and 
provide accurate calibrations of their masses. The cluster studies thus will remain a vibrant 
and fascinating area of modern cosmology for years to come.